Abstract

Theoretical models of neuronal function consider different mechanisms
through which networks learn, classify and discern inputs. A central focus
of these models is to understand how associations are established amongst
neurons, in order to predict spiking patterns that are compatible with
empirical observations. Although these models have led to major insights
and advances, they still do not account for the astonishing velocity with
which the brain solves certain problems, nor for what lies behind its
creativity, amongst other features. We examine two important components
that may crucially aid a comprehensive understanding of these
neurodynamical processes. First, we argue that once presented with a
problem, different putative solutions are generated in parallel by
different groups or local neuronal complexes, with the subsequent
stabilization and spread of the best solutions. Using mathematical models
we show that this mechanism accelerates finding the right solutions. This
formalism is analogous to standard replicator-mutator models of evolution,
where mutation is analogous to the probability of neuron state switching
(on/off). Although in evolution mutation rates are constant, we show that
neuronal switching probability is determined by neuronal activity and the
associative weights, described by the network of synaptic connections. The
second factor that we incorporate is structural synaptic plasticity, i.e.
the making of new and disbanding of old synapses, which we apply as a
dynamical reorganization of synaptic connections. We show that Hebbian
learning alone does not suffice to reach optimal solutions. However,
combining it with parallel evaluation and structural plasticity opens up
possibilities for efficient problem solving. In the resulting networks,
topologies converge to subsets of fully connected components. Imposing
costs on synapses reduces the connectivity, although the number of
connected components remains robust. The average lifetime of synapses is
longer for connections that are established early, and diminishes with
synaptic cost.


1 Introduction 

Many mechanisms of cognition, memory and other aspects of brain function
remain unclear. It is acknowledged that associations build up by updating
synapses between neurons that spike (nearly) synchronously to a given
stimulus. In this way some neuronal circuits can predispose or anticipate
a response to similar stimuli by retrieving information stored in synaptic
weights. Synaptic weights may in turn be systematically altered by
successful anticipation or recognition activity. At the same time, given
the multidimensional space of alternative neuronal circuits and spiking
sequences, undirected random variation in circuitry and spiking is
extremely unlikely to produce better solutions for each new problem.

The overall connectivity of the human brain is sparse: roughly $10^{11}$
neurons are estimated to connect through some $10^{15}$ synapses. Learning
and cognition have been understood in terms of changes in associative
weights on networks of fixed topology. However, the discovery that rewiring
this network is not uncommon even in adult brains challenges the former
views regarding the mechanisms of learning. This rewiring, known as
structural synaptic plasticity (SSP), has been well documented
experimentally [1,2]. However, neither the full consequences nor the central
role of SSP have been fully clarified. Yet it is not only reasonable to
expect that SSP can encode information; there is also supporting evidence
[3]. Thus, associative weights and SSP are two mechanisms that have an
effect on learning. These need not be mutually exclusive; rather, as we show
in this article, they both seem to be necessary for different stages of
learning, such as short and long term, respectively.

Our knowledge about what determines the establishment of new synapses
is still limited, especially taking into account the sparseness and
dimensions of the brain. Neither synaptic weights nor SSP explain, on their
own, the variability in circuitry associated with a particular stimulus. As
such, they only show variability in time. If trial solutions to a problem
(such as learning or recognizing a pattern) rely on serial evaluations, SSP
is a poor candidate mechanism, even for long-term learning. Under serial
evaluations the time for establishing new synapses would be prohibitively
large to account for randomly testing connections amongst pairs of neurons.

Changeux [4,5] and Edelman [6] proposed a selectionist [7] framework
for brain function. They noted that selection acts through preferentially
reinforcing and stabilizing some synaptic patterns over others, and through
the elimination of dysfunctional neurons and neuronal connections. Although
these ideas are correct, they are incomplete because they only consider the
fate of initial topological variability in circuitry, thought to occur only
during development. In their framework, selection acts on this standing
variation, stabilizing functional circuits that remain unchanged throughout
life, with later learning and problem solving resulting only from changing
synaptic weights. In this sense, the role of selection is limited to
establishing functional neuronal networks at early stages. The ideas that we
investigate in this article go beyond this view: we consider that selection
of novel variation plays an active role in learning through life.

Kilgard proposed a verbal model that accounts for circuitry variation
during learning periods [8]. In his ‘expansion-renormalization model’ he
envisions that SSP accounts for such variation. The mechanism is as follows.
When a cortical subnetwork is challenged by a novel task, new synapses are
generated in response, out of which only the functionally important ones
are kept, while the obsolete ones are eliminated. This is like an iterated
Changeux-type overproduction-selective stabilisation mechanism, and is
explicitly regarded as a Darwinian mechanism by the author. However, he
does not discuss particulars such as what the true units of variation are,
and how this mechanism acts quantitatively. Our ideas are conceptually
similar, but we pin them down to specific ‘learning’ units and develop
quantitative models to understand how this variability is generated and how
it affects learning.

We note that there are at least two other sources of neuronal variability.
The first one is the variance in spiking patterns and is due only to the
stochastic behaviour of neurons (cf. [9,10]). The second one, which is more
fundamental, is due to SSP, which acts by rewiring the set of neurons in a
complex. Selection is then able to act on the variation that is generated by
the three mechanisms. We point out that the crucial one is SSP, but as we
will explain throughout this article, the three mechanisms play different
roles in learning.

We assume that circuits that result in a sub-optimal solution relative to
the rest of the circuits not only receive less reward, but also are more likely
to be ‘overwritten’ by transmitting the information in the form of synaptic
weights and structure from other local complexes. During this transmission
process, small variations are introduced to the new circuit through SSP.
Iterating this mechanism results in the increase in the representation of the
circuit that gives the best solution, gradually replacing other circuits until
no better variants are further produced, and finally (and ideally) a solution
is found. Our central aim is to understand how different neuronal complexes
might evaluate possible solutions in parallel and thus compete to converge
to an optimal result during learning (Fig. 1). For this, we put together all
these verbal ideas into a quantitative framework.

We study the properties that need to be sought in order to understand
more accurately how learning occurs. For this reason we build up from local
mechanisms of neural learning. That is, we set our problem at a time scale
that allows us to follow whether neurons are found to be on or off. Each
neuron is assumed to fire stochastically, but with a probability given by the
input activity of other neurons in the complex. We will assume reinforcement
learning and, as in other works, employ simple measures such as the distance
between the output and the target. We emphasize that this is analogous to
the gradient of a fitness landscape in evolution [11]. This analogy will
allow us to tackle the problem with full force, partly by employing the
mathematical models developed in evolutionary biology.

Despite the high level of abstraction of our approach, we acknowledge
that an ultimate verification of our hypothesis needs to come from
experimental neuroscience. However, at the moment we intentionally avoid
discussing molecular or physiological aspects. Although these are essential
to understand the problem experimentally, at this point they would simply
obscure what we propose are the strategic means through which the brain
works at the level we aim to describe.


1.1 Analogy with Darwinian Evolution 

As stated above, so-called Neural Darwinism does not account for the
repeated generation of post-developmental variation in circuitry, on which
selective mechanisms could act. However, this is still not enough for a complete

Figure 1: Replicative neurodynamics. Panel titles: (A) Distributed input;
(B) Parallel processing; (C) Output.

(A) The input is fed into several local neuronal groups. (B) Each of these
groups evaluates the input independently, thus trying distinct spiking
patterns in parallel (represented by neurons in white and grey states), and
(C) producing distinct outputs with corresponding reward/fitness values $W$.
(D) Groups that result in higher fitness transmit their synaptic weights to
other groups that performed poorly (connections amongst groups are assumed
to exist but are not displayed in the figure, and are not explicitly
modelled). This parallel evaluation is repeated until an optimal solution
spreads across all groups.

implementation of adaptive evolution in the functioning of the brain. What
is missing is an interpretation of heredity in terms of neurophysiology, so 
that the selected variants can be expanded and, from them further variants 
be generated. In this way the interaction between selection, variability and 
heredity can find the right spiking patterns to solve a problem jT] . 

The mechanisms for generating variability of neural spiking patterns are
relatively simple to rationalize, and there are many works in the literature
that take this aspect as their modelling objective [12]. It is less obvious,
but has deeper implications and more far-reaching consequences, to realise
that a mechanism of ‘neuronal heredity’ between local complexes might exist.

As explained above, for neuronal heredity to occur it is necessary that
several local networks can act in parallel; stochastic variation in spiking is
not enough. Heredity occurs when circuits that have reached satisfactory
solutions transmit their contents to some other circuits that did not perform
as well (Fig. 1). Although there is no replication of the population of
neurons per se (as in a biological population), these repeated rounds of
evaluation and replacement implement a mechanism of heredity [7,13] that is
analogous to genetic inheritance. Admittedly, models of ‘neuronal
replication’ thus far have relied too much on accurate topographic mapping
between local networks. In general this assumption is unjustified. We hasten
to note that a neurobiologically realistic solution to this problem is
already in our hands and will be the subject of forthcoming publications. We
regard our present paper as a ‘formal’ one partly because we treat the
component process of replication as a black box, the content of which will be
revealed later. Thus we perform an abstract analysis of evolutionary
neurodynamics by linking basic theory in neuroscience and evolutionary
biology under the assumption that neuronal heredity is solved. Note that the
discussions on the two mechanisms of accumulating knowledge (by evolution and
by learning) have been largely isolated from each other. These two sides of
the discussion are not exclusive. We of course recognize that spiking
neurons, Hebbian learning and SSP exist, and are central components of
cognition, but we argue that on their own, they do not suffice for
explaining how complex tasks are solved.

In this paper we merge these concepts formally and investigate how learning
and Darwinian selection can work together to drive the system to optimal
solutions. Our approach is akin to models coupling learning and selection
by directly rewarding over stimuli (cf. the direct actor; [14], pp. 344-346),
which, as in our evolutionary approach, results in maximising the average
reward (i.e. fitness). However, our work has a wider scope since it also
merges other ideas such as neuronal copying and structural synaptic
plasticity (though see Discussion). We take a mathematical and computational
approach to understand the principal factors that are relevant and isolate
aspects that are in principle falsifiable. We show that for eyes educated in
evolutionary biology, the equations that describe the whole process are
astonishingly similar to the mutation-selection equations, albeit with a
twist. The quantity that is analogous to mutation rates is not constant since
it depends on the state of the whole system. The relevance of this difference
is that such ‘mutation rates’ derive from the associative (Hebbian) learning
mechanism, where the synaptic weights represent the strength of the
associations, effectively represented by a graph with weighted edges. (This
would be equivalent to mutation rates that depend on the extant population
diversity.) When coupled to selection, the learning system is more efficient
than simple hill-climbing. This is our first central result. Second, we will
consider asymmetric landscapes and show that the learning weights correlate
with the fitness gradient. That is, the neuronal complexes learn the local
properties of the fitness landscape, resulting in the generation of
variability directed towards the direction of fitness increase, as if
mutations in a genetic pool were drawn such that they would increase
reproductive success. Third, we study how efficient this mechanism is in
reaching optimal solutions in rugged landscapes. We show that the system
often reaches sub-optimal solutions when we assume random networks of
synaptic connections. We identify these as impasse states. That is, the
network reaches a sub-optimal state where each possible modification of the
synaptic and spiking activity only decreases the quality of the solution.
However, when synaptic plasticity is considered, these states are easily
overcome. For these dynamics, we characterize the distribution of lifetimes
of synaptic connections. This is a central aspect because it is a quantity
that can be measured experimentally. Although at this stage we are not
concerned with parameter estimation or inference, we do point out that our
results are in principle falsifiable.

2 Models 

We note that on short time scales (milliseconds) spikes take place and the
selective dynamics can act by rewarding different sub-networks of the
neuronal circuit. Yet, variation in spiking can be produced due to changes in
synaptic weights. On a larger time scale, SSP generates novel circuitry. For
simplicity we separate these two time scales. We first describe the joint
action of selection in several groups and learning. For now we assume that
all groups have the same topology of connections, but each one has a
different spiking pattern; we describe SSP below.

2.1 Learning in parallel groups 

In the spiking models, learning occurs by updating the weights that determine 
the probability that a neuron fires. This update follows Hebb’s rule, verbally 
stated as ‘neurons that fire together, wire together’. Hebb’s rule has been 
modelled with fixed connection topology where the weights are allowed to 
change according to the covariance amongst neurons, as for example [15] : 

$$\Delta\phi_{ij} = \lambda X_j Y_i \qquad (1)$$
Table 1: Analogy between the concepts in evolutionary genetics
and neurodynamics.

Evolutionary genetics   Neurodynamics                 Symbol
---------------------   ---------------------------   -----------------
Loci/genes              Neurons
Nr. of loci             Nr. of neurons in a group     $n$
Alleles                 Neuron state (on/off)         $X$
Allele frequency        Firing probability            $p$
Population              Groups of neurons             $N\ (\to \infty)$
Adaptive landscape      Rewarding mechanism, score    $W$
Mutation rate           Switching probability         $\Lambda$, $M$
-                       Hebbian weights               $\phi$
-                       Learning rate                 $\lambda$
-                       Synaptic cost                 $k$


where $\lambda$ is the learning rate, $X = 1$ if the neuron fires (on) and
$X = -1$ if it does not (off), $Y_k = \sum_j \phi_{kj} X_j$ is the output of
neuron $k$, and $\phi_{ij}$ is the weight between neurons $i$ and $j$. (Note:
in the neuroscience literature, weights are denoted by $w$; however, this
notation is potentially confusing in the context of evolutionary analyses
because a similar symbol is employed for fitness; see Table 1.)

Hebb’s rule is problematic because it allows weights to increase
unboundedly. Thus, for computational convenience, we employ Oja’s rule,
which is a version of Hebb’s rule with normalised weights:

$$\Delta\phi_{ij} = \lambda Y_i (X_j - \phi_{ij} Y_i) . \qquad (2)$$

Oja’s rule ensures that the weights are normalised, in this case with the
Euclidean norm, so that $\sum_k \phi_{ik}^2 = 1$.
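
The normalising effect of Oja’s rule can be checked numerically. The sketch below is illustrative only: it assumes a single neuron driven by random on/off inputs coded as +1/-1, and the names `oja_update`, `n_inputs` and the parameter values are hypothetical choices, not taken from the model above.

```python
import numpy as np

def oja_update(phi, x, lam):
    """One Oja step for a single neuron; phi are its input weights, x the +/-1 inputs."""
    y = phi @ x                            # output Y_i = sum_j phi_ij X_j
    return phi + lam * y * (x - phi * y)   # Hebbian growth minus normalising decay

rng = np.random.default_rng(0)             # arbitrary seed, for reproducibility only
n_inputs, lam = 5, 0.01                    # illustrative values (assumptions)
phi = rng.uniform(0.0, 0.01, n_inputs)     # small initial weights
for _ in range(5000):
    x = rng.choice([-1.0, 1.0], size=n_inputs)  # random on/off input pattern
    phi = oja_update(phi, x, lam)

norm = float(np.linalg.norm(phi))          # Euclidean norm of the weight vector
print(round(norm, 2))                      # expected to be close to 1
```

With plain Hebbian updates (dropping the `- phi * y` term) the same loop would let the norm grow without bound, which is the problem that the normalised rule avoids.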

Whether any one neuron spikes or not is assumed to be a random event.
The probability with which neurons change state (switch on or off) is given
by an update rule $\Lambda$ that depends on the state of the input neurons
and their weights. Thus, the probability that a neuron $i$ is on,
$\Pr[X_i = 1] = p_i$, is given by the master equation

(3)

where $\Lambda^{\mathrm{on}} = \Pr[X_i : 0 \to 1]$ and
$\Lambda^{\mathrm{off}} = \Pr[X_i : 1 \to 0]$ are the probabilities
that inactive neurons spike and spiking neurons shut down, respectively. We
assume that the update rule takes into consideration the state of both the
focal neuron and the rest of the neurons in the group at the previous
evaluation round. We also assume a time scale that is larger than the
refractory period, so that spiking is only affected by the previous state of
the network.

Some models, such as Boltzmann machines (see [14]), assume that the
neuron itself has no memory, and its spiking probability is independent of
previous neuronal states. In this case
$\Lambda^{\mathrm{on}} = \Pr[X_i = 1 \mid X] = 1 - \Pr[X_i = 0 \mid X]$;
the master equation gives

(4)

and the ‘on’ probability is given by

$$\Lambda^{\mathrm{on}} = [1 + \exp(-Y_i)]^{-1} . \qquad (5)$$

For firing neurons the weights $\phi$ increase, so $\Lambda^{\mathrm{on}}$
tends to grow to 1. Hence, co-spiking neurons become more likely to fire in
each evaluation round.
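
For orientation, the balance implied by the two transition probabilities can be written out. The following is a generic two-state master equation, offered as an assumption about the form and not necessarily in the notation of Eqs. (3) and (4); here $\Lambda^{\mathrm{on}}$ and $\Lambda^{\mathrm{off}}$ denote the probabilities that an inactive neuron spikes and that a spiking neuron shuts down.

```latex
% Assumed generic two-state balance (gain minus loss); the original notation may differ.
\frac{dp_i}{dt} = \Lambda^{\mathrm{on}}\,(1 - p_i) - \Lambda^{\mathrm{off}}\,p_i .

% Memoryless case, \Lambda^{\mathrm{off}} = 1 - \Lambda^{\mathrm{on}}:
\frac{dp_i}{dt} = \Lambda^{\mathrm{on}} - p_i ,
\qquad \text{with stationary value } p_i = \Lambda^{\mathrm{on}} .
```

In the memoryless case the firing probability simply relaxes to the ‘on’ probability, which is why a growing $\Lambda^{\mathrm{on}}$ makes co-spiking neurons more likely to fire.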

However, we take a different approach by assuming that learning can
be modulated more efficiently by allowing $\Lambda^{\mathrm{on}}$ and
$\Lambda^{\mathrm{off}}$ to have an effect on the network. Note that this
description of learning is coarse-grained: it only tracks how often a neuron
tends to be on as learning proceeds. This is a different view from that of
machine learning, where neural networks are trained by a set of examples from
which the weights are inferred. Then, from this inference the model can be
used to predict or classify data that were not included in the training set.
Our goal in this paper is different: we consider parallel networks that try
to solve a specific problem.

Solving a particular problem requires some comparison between the target,
$T$, and the output or solution, $\Omega$. For this purpose, we can employ
the square deviation, $\Delta^2 = (\Omega - T)^2$, which we want to minimize.
We assume that $T$ is a given parameter and $\Omega$ is the output evaluation
of the network. Each network presents an alternative solution, thus having a
different deviation $\Delta$ from the target. Hence, we minimize the mean
value of the deviation, $\overline{\Delta^2} = E[(\Omega - T)^2]$. Under a
proper scheme of neuronal network replication, this minimization amounts to
Darwinian selection. This selective scheme is the following. First, each
local network is weighted according to its fitness, given by
$W = \exp(-\beta\Delta^2)$. Second, groups that have larger fitness are kept.
Third, networks with lower fitness are overwritten with the content (spiking
and/or weight states) of the groups with large fitness. (There are several
ways in which this copying can be implemented: this is the black box part
as explained above.) Since in the present model we assume that there are
infinitely many groups, replacement need not be done explicitly: we simply
consider the proportions of groups (this in order to have a direct link to
classical population genetics models that assume infinite population size).
Since we assume that copying is random across different neuronal loci, the
proportion of a group with a specific configuration is simply the product of
the probability of the state of each neuronal locus (this is analogous to the
Hardy-Weinberg assumption of population genetics; [11], pp. 34-39).
Mathematically, we track the proportions, $p$, of active neurons and the
distribution of weights across groups. For the former:


$$\frac{dp_i}{dt} = p_i(1 - p_i)\,\frac{\partial \log \bar{W}}{\partial p_i} + M_i(2p_i - 1) \qquad (6)$$


where $W_i$ is the fitness, and $\bar{W} = \sum_i p_i W_i$ the mean fitness.
The first term is well known to evolutionary biologists and computer
scientists: it describes the process of hill-climbing in the direction of
fitness increase [11]. Note that we can approximate
$\log \bar{W} = \bar{\Delta}^2 + \mathrm{var}(\Delta)$. The second term
represents the variability that is generated throughout learning. For
simplicity, we assumed that the switching probability is symmetric
($M_i = \Lambda^{\mathrm{off}} = 1 - \Lambda^{\mathrm{on}}$), and is given by
the activity rule


$$M_i = \frac{1}{1 + e^{Y_i}} \qquad (7)$$


where $Y_i = \sum_j \phi_{ij} X_j$ is the activity or current of the input
neurons, and $\phi_{ij}$ are the weights determining the associations amongst
neurons, which evolve according to Oja’s rule. As more spiking neurons are
connected, the activity of the focal neuron increases and its switching
probability decreases asymptotically to zero. Whether a neuron stays on or
off, however, depends non-trivially on the collective success of reaching
the target $T$.
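
The three-step selective scheme described above (weight by fitness, keep the fitter groups, overwrite the rest) can be sketched for a finite number of groups. This is a toy illustration under stated assumptions: the output $\Omega$ is taken to be the number of active neurons, copying transfers whole spiking patterns, and the function names and parameter values are hypothetical. The model itself works with infinitely many groups and leaves the copying mechanism as a black box.

```python
import numpy as np

def fitness(patterns, target, beta):
    """W = exp(-beta * (Omega - T)^2); Omega is assumed to be the number of active neurons."""
    omega = patterns.sum(axis=1)
    return np.exp(-beta * (omega - target) ** 2)

def selection_round(patterns, target, beta):
    """Keep the fitter half of the groups and overwrite the rest with copies of them."""
    w = fitness(patterns, target, beta)
    order = np.argsort(-w)                   # group indices, fittest first
    half = len(patterns) // 2
    keep, replace = order[:half], order[half:]
    new = patterns.copy()
    # each low-fitness group receives the content of a kept group (cycled if needed)
    new[replace] = patterns[keep[np.arange(len(replace)) % half]]
    return new

rng = np.random.default_rng(1)               # arbitrary seed
groups = rng.integers(0, 2, size=(8, 10))    # 8 groups of 10 on/off neurons (illustrative)
T, beta = 10, 0.05                           # target: all neurons on; beta is arbitrary
before = fitness(groups, T, beta).mean()
after = fitness(selection_round(groups, T, beta), T, beta).mean()
print(before, after)                         # mean fitness does not decrease
```

Because every overwritten group is replaced by one that is at least as fit, the mean fitness cannot decrease in a round; new variation has to come from the switching term of Eq. (6).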

Appendix A shows that to first-order approximation we can track only
the mean weight at every synapse and apply a general learning rule to all the
average activities of the ensemble of groups. That is, we approximate that
each network has, on average, input activity
$Y_i = \sum_j \bar{\phi}_{ij}(2p_j - 1)$. This still allows each network to
have a different spiking pattern from those of other groups. We will see
below that even under these simplifying assumptions, evolution has a dramatic
effect by accelerating convergence to maximum fitness (or minimum
$\Delta^2$). We will assume small initial values of the synaptic weights.
Moreover, the variance of these becomes increasingly small as the neuronal
complexes converge to a solution. Thus, we will make no further distinction
between $\phi$ and $\bar{\phi}$.

Summarizing, for n neurons and k synapses amongst them we work with 
n + k ordinary differential equations. Initial conditions are assumed to be 
random and the topology of the network is assumed to be fixed during each 
learning round. 
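
A minimal numerical sketch of such a system of n + k equations is given below, assuming a fully connected network without self-connections, a constant fitness gradient S, a mean-field version of Oja’s rule, and simple Euler integration. Two choices are assumptions of this sketch rather than statements about the model: the switching term is written as M_i(1 - 2p_i), the usual mutation-balance convention that pulls p_i towards 1/2, and the parameter values are chosen only so that the run finishes quickly.

```python
import numpy as np

def simulate(n=5, S=5.0, lam=0.05, dt=0.01, steps=20000, seed=0):
    """Euler integration of a mean-field selection-learning system (illustrative)."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.0, 0.01, n)             # firing probabilities, almost all off
    mask = 1.0 - np.eye(n)                    # no self-connections (assumption)
    phi = rng.uniform(0.0, 0.01, (n, n)) * mask   # small initial mean weights
    for _ in range(steps):
        x = 2.0 * p - 1.0                     # mean neuron state on the +/-1 scale
        Y = phi @ x                           # mean input activity Y_i
        M = 1.0 / (1.0 + np.exp(Y))           # switching probability, as in Eq. (7)
        # selection term plus switching term (sign convention assumed, see lead-in)
        dp = p * (1.0 - p) * S + M * (1.0 - 2.0 * p)
        # mean-field Oja rule: d(phi_ij)/dt = lam * Y_i * (x_j - phi_ij * Y_i)
        dphi = lam * Y[:, None] * (x[None, :] - phi * Y[:, None]) * mask
        p = np.clip(p + dt * dp, 0.0, 1.0)
        phi = phi + dt * dphi
    M = 1.0 / (1.0 + np.exp(phi @ (2.0 * p - 1.0)))
    return p, phi, M

p, phi, M = simulate()
print(p.round(3), M.round(3))
```

In this sketch the weights grow only after selection has raised the firing probabilities, after which the switching probabilities fall below their naive value of 1/2 and the firing probabilities move closer to 1.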
2.2 Information content of a synapse


How to measure the information content of a synapse is not obvious. First,
we point out that it is important to discern between local measures, such as
the information in a particular synapse, and global measures, such as the
information in the whole circuitry of a complex, or even across complexes
[16]. Different choices depend on the specific purpose, and can be made
according to distinct epistemological views and/or experimental purposes. For
the aim of this article we choose one of the simplest measures, mutual
information, $H$, because it describes the interdependency between two
specific neurons in the context of a specific complex. Mutual information is
defined as


$$H_{ij} = \sum_{r,s \in \{0,1\}} \Pr[X_i = r \mid X_j = s]\, p_j \log\left(\frac{\Pr[X_i = r \mid X_j = s]}{p_i}\right) . \qquad (8)$$


To calculate the conditional probabilities we first evaluate the input
activity $Y_{i|j}$ of neuron $i$ by fixing the value of neuron $j$ to
$p_j = 0$ or 1. This gives a conditioned value of the switching probability,
$M_{i|j}$. Then, the solution to Eq. (6) using the conditioned switching
probability $M_{i|j}$ gives the desired conditioned probabilities. In this
case we keep the weights constant because we are only assessing the
information capacity of the specific synapse and not the information capacity
of the whole network. The exact expression of $H$ is derived in Appendix B,
where we also show that for Gaussian selective landscapes the information is
approximately:


Hi] = 6o/ ; .\/,.!/, ■ (9) 

Mutual information quantifies how likely it is that, if one neuron spikes,
the other one will also do so. If the neurons are not connected,
$\phi_{ij} = 0$, implying $H_{ij} = 0$. If the focal neuron $i$ spikes
randomly (large $M$) then the information content is low (in that case
$\phi_{ij}$ is expected to be close to zero). Since $0 < |\phi_{ij}| < 1$,
the quadratic term dominates over $M$, which makes $H_{ij}$ proportional to
$\phi_{ij}^2$. Hence, the information of a synapse increases as the weight
increases. Note that for a given switching probability, the learning weights
are higher in sparse networks than in fully connected ones. Thus, to an
extent, the former encode more information than the latter.
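
As a concrete illustration of the quantity being measured, the sketch below computes the textbook mutual information between two binary neurons from a firing probability and two conditional probabilities. It is a generic calculation, not the conditioning procedure based on Eq. (6), and the function name and argument names are hypothetical.

```python
import numpy as np

def mutual_information(p_j, p_on_given_on, p_on_given_off):
    """Mutual information (in nats) between binary neurons i and j.

    p_j            : probability that neuron j is on
    p_on_given_on  : Pr[X_i = 1 | X_j = 1]
    p_on_given_off : Pr[X_i = 1 | X_j = 0]
    """
    prior_j = np.array([1.0 - p_j, p_j])                        # Pr[X_j = s], s = 0, 1
    cond = np.array([[1.0 - p_on_given_off, 1.0 - p_on_given_on],
                     [p_on_given_off, p_on_given_on]])          # cond[r, s] = Pr[X_i = r | X_j = s]
    joint = cond * prior_j                                      # Pr[X_i = r, X_j = s]
    marg_i = joint.sum(axis=1, keepdims=True)                   # Pr[X_i = r]
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log(cond / marg_i)
    return float(np.nansum(terms))                              # 0 * log(0) terms count as zero

independent = mutual_information(0.5, 0.3, 0.3)   # neuron i ignores neuron j
coupled = mutual_information(0.5, 1.0, 0.0)       # neuron i copies neuron j
print(independent, coupled)
```

The two limiting cases bracket the behaviour described in the text: an uninformative connection carries no information, while a perfectly coupled pair carries log 2 nats.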

2.3 Structural synaptic plasticity 

We implement SSP on a time scale different from that of associative learning. 
The system described above is in terms of statistical averages, and can be 
regarded as conditioned on a given network of connections. We assume that 
synaptic connectivity changes occur in one arbitrary group (below we explain 
how changes are introduced). If the new topology improves fitness, it spreads 
across all groups. However, we allow for a random component to avoid 
trapping in local optima, where the network is stranded in a state of impasse. 
For simplicity this is implemented through a Metropolis-Hastings algorithm. 
That is, if fitness increases with the new topology, this spreads to all groups. 
If fitness remains unchanged or decreases, then the new topology may still
spread with probability $\exp(W_{\mathrm{new}} - W_{\mathrm{old}})$. Allowing
for this fitness decrease facilitates the escape from states of impasse.

We assume that changes in synaptic structure follow two heuristic rules
inspired by neuroscience. First, if two neurons are unconnected but they
are highly likely to spike, then a new synapse between them can be
introduced. There is evidence that synaptic rearrangements result from
circuit rewiring upon stimulation (e.g. in neocortical pyramidal neurons).
Algorithmically, we randomly choose pairs of neurons with probability
$p_i p_j$ amongst the set of unconnected pairs of neurons. Second, we allow
existing synapses to be disconnected randomly with probability

$$R = \exp[-aH] , \qquad (10)$$

where $H$ is the synaptic information (Eq. 9). That is, if a synapse is
informative, then it is unlikely to be disconnected, whereas if it contains
no information, it is likely to be disconnected [3,20,21].

In addition, we apply a Metropolis acceptance/rejection step: if the
change in the network structure results in a fitness increase, it is
accepted. If, on the contrary, the change results in a fitness decrease, it
is accepted with a probability proportional to the ratio of the fitness of
the new network to that of the old one.
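
The two rewiring rules and the acceptance step can be sketched as follows. This is a hedged illustration, not the implementation used for the results: synapses are stored in a directed 0/1 adjacency matrix, exactly one addition is proposed per step, the acceptance probability is taken to be the fitness ratio itself, and all function names are hypothetical.

```python
import numpy as np

def removal_probability(H, a):
    """Eq. (10): probability of disconnecting a synapse with information H."""
    return np.exp(-a * H)

def metropolis_accept(w_new, w_old, rng):
    """Always accept improvements; accept a fitness loss with probability w_new / w_old."""
    return w_new >= w_old or rng.random() < w_new / w_old

def propose_topology(adj, p, H, a, rng):
    """Return a candidate adjacency matrix after one round of removals and one addition."""
    n = len(p)
    new = adj.copy()
    # removal: each existing synapse is dropped with probability exp(-a * H_ij)
    drop = (adj == 1) & (rng.random((n, n)) < removal_probability(H, a))
    new[drop] = 0
    # addition: choose one unconnected pair with probability proportional to p_i * p_j
    free = np.argwhere((adj == 0) & ~np.eye(n, dtype=bool))
    if len(free) > 0:
        weights = p[free[:, 0]] * p[free[:, 1]]
        if weights.sum() > 0:
            i, j = free[rng.choice(len(free), p=weights / weights.sum())]
            new[i, j] = 1
    return new

rng = np.random.default_rng(2)                # arbitrary seed
adj = np.zeros((4, 4), dtype=int)             # start with no synapses
p = np.full(4, 0.9)                           # all neurons likely to spike
H = np.zeros((4, 4))                          # no information stored yet
candidate = propose_topology(adj, p, H, a=1.0, rng=rng)
print(candidate.sum())                        # one synapse has been added
```

A full step would evaluate the fitness of `candidate`, for instance by re-running the learning dynamics on the new topology, and then call `metropolis_accept` to decide whether the change spreads to all groups.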


3 Results 

3.1 Selection and learning together speed up finding solutions

To understand how learning and selection jointly act we first assume an
elementary scenario, i.e. a simple hill-climbing process where we target for
all neurons to be on. This situation results from a fitness function given by
$W = \exp[S \sum_i p_i]$, which has a constant gradient,
$\partial_{p_i} \log W = S$ for every neuron. (This can be seen as a limit
where $T$ is far from the current state, thus $S = 2\beta T$.) We assume that
Hebbian learning is slower than selection, i.e. $\lambda < S$. This regime
describes the coupling of rounds of learning with copying across groups.
Otherwise, learning would be completed independently in each group,
associating random spikes, and would not be able to learn the relevant
features of the landscape. The associations that the network makes would
relate back to previous states, effectively acting against hill-climbing.
However, in the regime where learning is slower than selection, fitness
increases the representation of solutions, and once these are stabilised,
learning can create meaningful associations. Figure 2 presents a typical
outcome where the process is characterized by three stages. We first observe
an initial exponential increase in the proportion of active neurons. Compared
with systems that do not learn, the dynamics are similar on short time scales.
Figure 2: Example of selection-learning dynamics.

(A) Selection-learning dynamics (black lines) compared to standard
mutation-selection with naive transition rates ($M \sim 1/2$; blue) and to
the run with the transition rates already learnt (red). Inset: evolution of
fitness.

(B) Evolution of the transition rates. Inset: evolution of Hebbian weights.
$n = 20$, $S = 5$, $\lambda = 0.001$. Initial conditions for allele
frequencies and for initial weights are randomly sampled from a uniform
distribution $U[0, 0.01]$. The learning network is fully connected. Note the
log-time scale.


This is because selection increases the representation of groups that provide
better solutions, but these are initially in very low proportions. These
fitter solutions are simply products of lucky stochastic events. Initially
there is hardly any learning, indicated by light weights, and selective
expansion simply amplifies those groups that have higher activity. This
amplification takes on the order of $1/S$ rounds of evaluation. As groups
become selected, learning takes over, entering an incubation period where
associations are built up because a good proportion of neurons fire
correctly. After a learning period, associations are fully strengthened and
the solution is finally reached, whereby switching probabilities reach a
minimum (Fig. 2B). The width of the plateau
has a duration of roughly y — y- This regime is noticeable on a log-scale. 
Although in absolute time the selective process is so quick that it might pass 
unnoticed, this selection stage is crucial to explore configurations that can 
be fixed through learning. We emphasize that this early stage corresponds 
to the selective stabilization in the Neural Darwinism theory. 
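
The constant gradient used in this scenario can be related to the Gaussian-type fitness of the Models section by a short expansion. The derivation below is a sketch that assumes the network output is $\Omega = \sum_i p_i$, i.e. the expected number of active neurons; this choice of $\Omega$ is an assumption made here for illustration.

```latex
% Assumption: \Omega = \sum_i p_i.
\log W = -\beta(\Omega - T)^2
       = -\beta T^2 + 2\beta T\,\Omega - \beta\,\Omega^2
       \approx \mathrm{const} + 2\beta T \sum_i p_i
       \qquad (T \gg \Omega),

\frac{\partial \log W}{\partial p_i} \approx 2\beta T \equiv S .
```

Under this assumption every neuron experiences the same selective pressure $S$, which is what makes the scenario a plain hill-climbing problem.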
Instead of favouring equally all neurons to spike, the landscape can be
set to favour distinct neurons to fire preferentially over others. For
instance, making $W = \exp[\sum_i S_i p_i]$ and allowing $S_i$ to take any
arbitrary value introduces asymmetries to the landscape. Crucially, if the
dynamics are re-run with the learnt weights, the equilibrium is reached
orders of magnitude faster. We stress that this is true even if initial
spiking probabilities are randomised (Appendix C).

Importantly, the switching probabilities become strongly correlated with
the fitness gradient, meaning that the associative weights learn the local
properties of the landscape. Figure 3 shows that if the relative switching
probabilities are scaled as $\tilde{M}_i = M_i / M_m$, where $M_m$ is the
largest M of all the neuronal loci, then there is a universal behaviour with
$\tilde{S} = \sqrt{n-1}\, S / M_i$ of the form:

$\tilde{M} = e^{-\tilde{S}}$. (11)

This is an approximate form which breaks down at very low values of S (see
Appendix D). Intuitively, we would expect that larger values of S would result
in small switching probabilities. What happens is that the coupling of the
system is sensitive to the distribution of S. For example, if all the selective
values are similar (i.e. the variance is low) then the corresponding M’s are
all low (e.g. red curves in the inset of Fig. 3). If the S’s are distributed in a
wider range, then the M’s are also distributed more widely but have, overall,
larger values (e.g. black curves in the inset of Fig. 3). In other words, for a
given mean value of S, the average M increases with var(S). In fact, there
is a positive correlation between the variance of S and the mean switching
probability (data not shown). In turn, for a given variance of S, M decreases
with the mean of S. This means that the most efficient complexes are composed
of neurons that are required to fire more specifically.

Figure 3: Relationship between selection intensity and associative
weights.

By scaling the switching probabilities with the maximum M of a network,
and scaling selection intensity as $\tilde{S} = \sqrt{n-1}\, S / M$, we obtain a
universal behaviour (black line). The distinct symbols represent runs with
(squares) n = 5, (circles) n = 10 and (diamonds) n = 15 neuronal loci. For each
n we report 300 independent runs. The selective values at each locus are chosen
independently from U[0, 10]. A = 0.01, $p_0 \sim U[0, 0.1]$,
$\phi_0 \sim U[0, 0.01]$.


For a given system, the associative weights increase (asymptotically) with
the strength of selection (data not shown). We performed Spearman’s rank
correlation test to measure the strength of the dependency.
(Because of the non-linearity, ‘standard’ Pearson’s correlation is not a good
measure of the dependency between S and $\phi$.) In all cases the p-values
were numerically zero, indicating strong dependence between S and $\phi$
(data not shown). This strong statistical support indicates that the synapses
encode the fitness gradient, directing variant spiking patterns accordingly:
strong selection results in strong weights, which in turn decrease the
switching probability. This leads to minimal variability of spiking, which
maximises the speed of fitness increase. Conversely, weak selection leads to
poor associations resulting in large spiking variability, which allows
exploration of the landscape.
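The preference for a rank-based measure can be illustrated with a small
sketch. The saturating relation between S and the weights used below is a toy
assumption (it is not output of the model), and the helper names are
hypothetical. The point is only that a monotone but non-linear dependency gives
a rank correlation of one, while Pearson's coefficient is clearly lower.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average rank of the tied block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (assumption): weights saturate as selection intensity grows.
S = [0.5 * i for i in range(1, 21)]
phi = [1.0 - math.exp(-s) for s in S]

rho = spearman(S, phi)  # monotone relation, so the rank correlation is one
r = pearson(S, phi)     # the linear measure underestimates the dependency
```

In practice a statistics library would also return the p-value of the test;
the hand-written version here only keeps the example free of dependencies.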

We note that the learnt equilibrium point is independent of the learning rate
A. This turns out to be generally true, regardless of the fitness landscape.
We also note that under these ‘directional landscapes’ the initial conditions
(of both weights and allele frequencies) do not affect the equilibrium state
of the system. (However, later we will see that under more complex fitness
landscapes this is not true.)

3.2 Formal analogy between evolutionary dynamics and 
neurodynamics 

At this point we formalise further the analogy with evolutionary biology, and
more specifically with population genetics. First, we realise that the bimodal
neuron model is analogous to a biallelic genetic system. We start by clarifying
a small but crucial difference in the notation. While in the models considered
in this paper neurons take states {-1, +1}, in population genetics alleles are
typically denoted as {0, 1}. The +/- notation is convenient mathematically
in order to describe Hebb’s rules, thus in our evolutionary analogy we also
require this property. Hence if G is the value of a gene or allele, then we
define X = 2G - 1. In this way we can readily apply the machinery from
evolution to neuronal networks.
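As a quick check of this recoding, the sketch below maps the 0/1 allele
coding to the -1/+1 neuron states and computes the mean state of a locus at
which the +1 state has frequency p, which is 2p - 1. The function names are
hypothetical and serve only as illustration.

```python
def to_spin(g):
    """Map an allele coded 0/1 to the -1/+1 neuron state, X = 2G - 1."""
    return 2 * g - 1

def mean_spin(p):
    """Expected state when the +1 (firing) state has frequency p."""
    return p * to_spin(1) + (1 - p) * to_spin(0)

# A locus that fires in three quarters of the groups has mean state 2p - 1.
example = mean_spin(0.75)  # 0.5
```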

Second, we consider the spiking probability of a neuronal locus (node)
across all groups (Fig. 1). This average, which is the probability Pr(AA) that
a neuronal locus i fires in some of the groups, is thus analogous to the average
E[2G_i - 1] = 2p_i - 1, where p_i are allele frequencies at locus i. Note that
allele frequencies are interpreted as the probability of sampling a particular
allele in the population. Thus, for the analogy to be consistent, population
size needs to be analogous to the number of groups involved in the learning.
Although in both populations of individuals and of neuronal groups numbers
are in fact finite, in this work we consider, as a first approximation, an
infinite number. In this way we do not need to worry about stochastic effects
that complicate the analyses. However, we recognise that randomness due
to finite population size (a.k.a. genetic drift) can play a crucial role in both
evolution and in learning. This is because randomness facilitates escaping
local peaks and exploring the landscape in a less constrained manner. But
before taking stochastic factors into consideration we want to focus on the
interaction between selection and learning in the infinite population model.

Third, upon reproduction a population generates a new set of individuals,
which sooner or later replaces the parental population. In neurodynamics,
however, reproduction has to be interpreted in a particular way, because
there is no generation of a new set of neuronal groups. Instead, the selective
copying into groups with inferior performance effectively corresponds to
a new population of groups (Fig. 03).

Given the analogies above, we can ask the converse question: what is the 
interpretation of the learning process in evolutionary dynamics? 

Equation 3 describes the activity changes of neural networks across
iterations, leading to an update rule of the spiking frequency of each neuron.
In population genetics, this transition probability corresponds to a mutation
rate. In molecular evolution mutation rates are normally state-independent,
dictated by, for example, copying errors of the polymerases that replicate
DNA, repair mechanisms, or other molecular processes that do not depend
on the genetic states of the individual or population. (There are, however,
genetic models that consider evolvable mutation rates; see Discussion.) The
switching probability Mj = 1/(1 + exp[Y)]) is dependent on the state of the 
system, and follows directly from the update rule. Apart from this
dependency, the equations (Eq. 6) are analogous to a selection-mutation
equation. The resemblance is a natural outcome of the analogy laid out above.
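The contrast between a constant mutation rate and a state-dependent switching
probability can be sketched as follows. The logistic form 1/(1 + exp[.]) is
taken from the text; reading its argument as a weighted sum of the states of
the connected neurons is an assumption made here for illustration only, and
both function names are hypothetical.

```python
import math

def input_activity(weights, states):
    """Assumed input to a neuron: weighted sum of its partners' states."""
    return sum(w * x for w, x in zip(weights, states))

def switching_probability(h):
    """Logistic switching probability, 1 / (1 + exp(h))."""
    return 1.0 / (1.0 + math.exp(h))

# With no learnt weights the input is zero and switching is naive (1/2).
naive = switching_probability(input_activity([0.0, 0.0, 0.0], [1, 1, -1]))

# With strong weights that agree with the partners' states the input is
# large and the neuron rarely switches.
learnt = switching_probability(input_activity([2.0, 2.0, -2.0], [1, 1, -1]))
```

Under this reading, a neuron with few or weak inputs stays close to the naive
rate, whereas a neuron with many strong inputs becomes nearly deterministic.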

But beyond the cosmetic similarity between the replicator-mutator equation
and neural dynamics, the crucial difference is that the update rule is
able to learn the local properties of the fitness landscape. By doing so, hill
climbing towards a fitness peak is facilitated by generating variation directed
towards the fitness increase.
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Figure 4: Neurodynamics in a stabilizing rugged landscape. 

Model parameters: T = 15, otherwise as in Fig. 2.


3.3 Learning in rugged landscapes 

We now consider a more complex adaptive landscape, given by
$W = \exp[-\beta \Delta^2]$. In evolution this kind of landscape is known as
‘stabilizing selection’. The complexity of this landscape results from the
non-linear effects (known as epistasis in genetics and evolution). These are
hard landscapes to explore because there are many local peaks or solutions,
some equally optimal, some sub-optimal, and simple hill-climbing algorithms
often fail to converge to an absolute maximum of fitness.

Figure 4 shows the neurodynamics. We find that exactly 15 neurons fire
(with probability p = 0.995) and the remaining five remain off. In this case 
the uninformative neurons are shut down. Which neurons spike and which do 
not is contingent on the initial conditions, but in this landscape the identity 
of each neuronal locus is meaningless. Different initial conditions can lead to 
different but equivalent solutions (data not shown). 
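The multiplicity of equivalent solutions can be made concrete with a short
sketch. It assumes that the landscape has the form exp(-beta * delta^2), with
delta read as the number of firing neurons minus the target T; this reading,
the value of beta and the function names are assumptions for illustration.

```python
import math

def stabilising_fitness(states, target, beta=1.0):
    """Fitness of a -1/+1 state vector under the assumed stabilising landscape."""
    n_firing = sum(1 for x in states if x == 1)
    delta = n_firing - target
    return math.exp(-beta * delta ** 2)

def n_equivalent_optima(n, target):
    """Number of distinct ways to have exactly `target` of n neurons firing."""
    return math.comb(n, target)

# With n = 20 and T = 15 every choice of 15 firing neurons is equally optimal.
optima = n_equivalent_optima(20, 15)  # 15504

on_target = stabilising_fitness([1] * 15 + [-1] * 5, target=15)
off_by_one = stabilising_fitness([1] * 16 + [-1] * 4, target=15)
```

Because fitness depends only on how many neurons fire, not on which ones,
which optimum is reached is decided by the initial conditions.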


3.4 Random and sparse topologies of the neuronal
connections impair learning

So far we assumed that there are synapses amongst all pairs of neurons.
Relaxing that assumption corresponds mathematically to fixing certain weights
$\phi_{ij}$ to zero, indicating that no synapse exists between neurons i and j.
Under these circumstances the equilibrium switching and spiking probabilities
are more variable, with the spread determined by the connectivity of the
underlying learning network. Appendix E presents some neurodynamic
outcomes using different random topologies under directional and stabilising
landscapes. These topologies are drawn from different random graph models
with various degrees (see Methods). We tried Erdős-Rényi, Barabási-Albert
(scale free) and Watts-Strogatz small world topologies. Each of these
models has different statistical properties. Irrespective of these, there are
two central conclusions. First, random networks lead to unfit solutions, where
the systems cannot reach the target. This is true regardless of the target
value, number of neurons and type of topology. The systems typically converge
to a suboptimal solution where no further learning can happen and cannot
escape local optima. We regard this as a situation where a network that was
previously functional for another task is repurposed for a new task, and the
initial topology is, with regard to the new task, arbitrary. Thus, the initial
circuit is not expected to be adapted to the new task. Consequently, what the
system can learn is limited and, in the vast majority of cases, suboptimal. We
identify these solutions as states of impasse, which means there is no further
progress possible, because any small modification to the synaptic weights or
spiking patterns leads to a state with a lower fitness score.

The second central conclusion is that poorly connected neurons have very
low input activity, leading to high switching probabilities. Highly connected
nodes have small switching probabilities with spiking frequencies close to
unity. Hence, only highly connected nodes (which are the less frequent) can
learn efficiently. Since random topologies give suboptimal results, we
consider that details regarding specific network distributions are secondary
and discuss them only in Appendix E.

Our choice of network distributions is arbitrary, motivated by mathematical
convenience and scientific hype. Hence, the results above do not
necessarily imply that brains are suboptimal unless fully connected. However,
they reveal that in sparse networks, as in real neuronal complexes, Hebbian
learning does not suffice to solve complex problems because it too often leads
to impasses.

3.5 Structural Synaptic Plasticity 

Structural synaptic plasticity (SSP) is a mechanism that goes beyond the
update of existing synaptic weights (i.e. Hebbian learning) by allowing new
synapses to be established and old ones eliminated. This dynamical
restructuring of the topology of the neuronal networks as the system learns
has been shown to be important for the transfer of short-term to long-term
memory [8]. Here, however, we test the role of SSP in the more general
scenario of problem and impasse solving.

Above we found that random and sparse network topologies impair problem
solving on complex learning landscapes. This is paradoxical because there is
clear evidence that the brain is not fully connected, even though the type of
connectivity is disputed and tissue-dependent. But our negative results do not
rule out that there might be specific topologies that facilitate or optimize
learning. We now show that under synaptic plasticity the neuronal complexes
form particular structures that are unlikely to be recovered randomly, which
accounts for the negative results above.

A mechanistic description of SSP rooted in neurophysiological processes
is beyond the scope of this article, in part because much is unknown. Instead,
we propose a simple phenomenological model to show not only that SSP can
fundamentally affect the outcome of the learning process, but also that this
mechanism can resolve impasses. Adding further specific aspects keeps the
essential features unchanged, even though some of these might be crucial for
the actual implementation of the cellular mechanisms that we explain using a
simple model.

A central feature of SSP in the neurodynamics is that, by modifying the
distribution of synapses, it provides new pathways to explore the space of
solutions. This is why SSP can be an efficient mechanism to escape impasses.

3.6 Dependency on the number of neurons 

This random mechanism allows the neuronal networks to explore the space
of configurations, leading, on average, to an increase of fitness. Figure 5
shows that systems with few neurons evolve good solutions more easily than
larger systems. The reason is clear: finding an optimal configuration with a
few neurons requires fewer evaluations than with larger networks simply because
the search space of the former is much smaller than that of the latter. The
number of possible single modifications increases with n^2, thus on the basis
of trying one modification at a time the convergence time increases
non-linearly with the number of neurons. Although this holds true for our
model, there is no reason to think that there cannot be parallel evaluations
of different topologies in different complexes, dramatically alleviating this
inefficiency. In this paper, however, we restrict ourselves to evaluating one
modification per iteration. Note that one step in the iteration does not
correspond to a physiological time unit because the Metropolis algorithm only
ensures convergence to the equilibrium distribution as dictated by detailed
balance and considers no information regarding the diffusion leading to said
equilibrium.
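The one-modification-per-iteration scheme can be sketched as below. This is
a minimal illustration under stated assumptions, not the implementation used
for the figures: synapses are taken to be undirected, a proposal toggles a
single randomly chosen pair, and acceptance follows the standard Metropolis
rule on the ratio of fitness values. All function names are hypothetical.

```python
import random

def n_candidate_synapses(n):
    """Number of neuron pairs that could carry a synapse; grows as n^2."""
    return n * (n - 1) // 2

def propose_toggle(edges, n, rng):
    """Return a copy of the edge set with one random pair added or removed."""
    i, j = sorted(rng.sample(range(n), 2))
    proposal = set(edges)
    proposal ^= {(i, j)}  # add the synapse if absent, remove it if present
    return proposal

def metropolis_accept(w_old, w_new, rng):
    """Always accept an improvement; accept a loss with probability w_new / w_old."""
    if w_new >= w_old:
        return True
    return rng.random() < w_new / w_old

rng = random.Random(0)
candidates = n_candidate_synapses(30)  # 435 pairs for n = 30
```

In a full run, each accepted proposal would be followed by letting the
weights relax to equilibrium on the new topology before fitness is evaluated.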

In each round of learning (i.e. after the system has converged to equilibrium
with a newly tested network) the current weights are kept, and new
connections are assigned new random initial values. Alternatively, we can
simply reset all weights to random values. The second strategy proves to be
more efficient than the first. Whether spiking probabilities are reset or not
proved irrelevant (data not shown).


Figure 5: Dynamics of structural synaptic plasticity. 

Example of a random realization with n = 30 neuronal loci (fitness is scaled
to the maximum value). Inset: Absolute fitness as a function of the number of
neuronal loci. Parameters: T = 7; otherwise as in Fig. 2. Synapses are assumed
to have no cost.

3.7 Structural plasticity leads to maximal connectedness

The resulting synaptic networks are straightforward. Recall that our test
problem selects for a target number T of spiking neurons. Thus, the optimal
state has exactly T neurons on and the rest off. Ideally, these T neurons
are fully connected amongst themselves. We find that the complexes correctly
converge to solve the problems, and, thanks to SSP, the networks that evolve
fully connect these active neurons (Fig. 6). In other words, the systems
converge to networks that fully connect the components required to solve
the problem.

Figure 6: Evolved learning networks.

Example of evolved networks with n = 15 (left) and n = 20 (right) neuronal
loci. The colors indicate the frequency of active neurons. In both cases the
particular node labels are irrelevant, and the proportion of active neurons
depends on the initial conditions and the history of the process. Note that
irrespective of the number of neuronal loci, the number of active components
is correct. Parameters as in Fig. 5.

The convergence to fully connected networks is due to two factors. The
first is the need to switch on the right number of neurons, which requires
strong synapses amongst them. The second is to switch off the unneeded
components; this also requires connected components because negative weights
between active and inactive neurons decrease the probability of firing. If
negative weights are not allowed, the system can only maximise fitness by
ensuring the right neurons are on, and the networks converge to fully connect
these components (Fig. 6).

3.8 Neuronal networks are robust to small costs of 
synaptic connections 

Now we penalize the number of connections that the networks have (Fig.
7). There are various reasons to assume this constraint. First, there are costs
associated with synaptogenesis. Second, there are higher metabolic costs due to
the transmission of action potentials, which increase at least proportionally
(if not allometrically) with wiring. Third, there are major spatial constraints
in the brain, limiting the amount of white matter that can be packed. In
order to take into account these and other reasons for limiting the number
of connections, we include a fitness cost to the system, exp[-kd], where k is
the cost per synapse, and d is the number of synapses (number of edges in the
learning network).
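A rough feel for the size of this penalty can be obtained from the sketch
below. It assumes that the cost enters multiplicatively, so that fitness is
scaled by exp(-k d); this is one reading of the text, and the function names
and the value k = 0.1 are illustrative choices only.

```python
import math

def synapses_to_fully_connect(t):
    """Number of undirected synapses in a fully connected set of t neurons."""
    return t * (t - 1) // 2

def penalised_fitness(w, k, d):
    """Fitness w scaled by the assumed multiplicative synaptic cost exp(-k d)."""
    return w * math.exp(-k * d)

# Fully connecting T = 7 active neurons takes 21 synapses.
d_full = synapses_to_fully_connect(7)

# With a cost of k = 0.1 per synapse, a perfect solution keeps exp(-2.1)
# of its fitness, and every additional synapse multiplies it by exp(-0.1).
w_full = penalised_fitness(1.0, 0.1, d_full)
```

Under this assumption the penalty compounds with each synapse, so a new
connection is retained only if it improves the uncosted fitness by more than
a factor exp(k).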

In Fig. 7 we observe that finding the solution to a problem is impaired
as the cost of establishing synapses is increased. (In these examples we
target T = 7, but the particular choice of the target value is unimportant; in
Appendix F we present results for other target values.) Clearly, this is
because the number of connections decreases with increasing cost, which in
turn compromises spiking specificity.

It might be unsurprising that the number of synaptic connections
decreases with their cost, and naturally, the networks become less
discriminative as they lose connections. However, it is also true that they
show a notable level of resilience (graceful degradation), in the sense that
even if performance is somewhat impaired as synapses are eliminated, the
required number of connected neurons is robust to the cost. In other words, as
the cost increases the networks still converge to structures that connect (even
if sparsely) the necessary neurons (see Figs. 03). The networks lose
performance as they lose synapses because neurons receive less input and
therefore their switching probability becomes higher. Nevertheless, they tend
to remain connected with as many components as possible.

In the stationary state the distribution of networks is broad. As indicated
in Fig. 03, the average number of synapses decreases with the cost; this is
also true for its variance (in Appendix G we present the degree distributions).
This means that as the cost increases, each neuron establishes fewer synapses
with other ones. Also note that some of the components tend to have only
a few connections. This is indicated by the notion of connectivity (Fig. 03),
which is the number of synapses that we need to remove to separate the
network into two unconnected subsets. Typically in the evolved networks
this is due to a single poorly connected neuron, rather than to connected
sub-complexes interconnected by a few synapses. When costs are very high
(k ~ 1), the networks are sparse and have several unconnected components.

We point out two important differences between random networks and
the evolved distribution of networks. First, taking the random network as
a null model (the Erdős-Rényi model is the one that best matches the evolved
distributions; Appendix G), we expect a binomial degree distribution
B[n-1, p]. The observed distributions are reminiscent of the binomial using
the empirical p’s. However, in all cases we rejected the null hypothesis
($\chi^2$ tests, all p-values numerically zero); the expected variances are
too low.
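The null expectation can be sketched as follows: in an Erdős-Rényi graph each
of the n - 1 possible partners of a node is connected independently with
probability p, so degrees are binomial with mean (n - 1)p and variance
(n - 1)p(1 - p). The code below is an illustrative sketch with hypothetical
function names; it generates such graphs and returns the moments against
which an observed degree distribution could be compared.

```python
import random

def erdos_renyi_degrees(n, p, rng):
    """Degrees of one undirected random graph with edge probability p."""
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degrees[i] += 1
                degrees[j] += 1
    return degrees

def binomial_mean_var(n, p):
    """Mean and variance of the degree under the B[n-1, p] null model."""
    return (n - 1) * p, (n - 1) * p * (1 - p)

def sample_mean_var(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

rng = random.Random(1)
pooled = [d for _ in range(200) for d in erdos_renyi_degrees(20, 0.5, rng)]
observed = sample_mean_var(pooled)
expected = binomial_mean_var(20, 0.5)  # (9.5, 4.75)
```

An evolved ensemble whose degree variance clearly exceeds the expected value
for its empirical p would be flagged by this comparison, which is the pattern
described above.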

Second, despite the variability in the distribution of the evolved networks,
these solutions are not in states of impasse. With fewer synapses, the input
activities of the neurons are lower, translating into larger switching
probabilities (Fig. 7). This does not reduce the specificity of firing: the
correct neurons are still more likely to fire in an idiosyncratic manner.
However, there is more ‘background’ noise due to fluctuations. We could say
that for larger costs, neurons are still accurate, albeit less precise.

3.9 Distribution of synaptic lifetimes 

Part of the reason why our model shows a large variance in synaptic network
topologies is that we allow for synapses to continuously be established and
disbanded. The quantitative and dynamical extent of these processes is still
unknown, which is why we can only make arbitrary assumptions. Precisely
this, however, is an important point of falsification of our ideas. Irrespective
of the details regarding the assumed synaptic distribution, the observable
distribution can lead us to refine our model. Yet the relevant aspect here is
not the details of the predicted distribution (since, as argued above, we
present a simplified model), but the fact that we predict that there is a
distribution of synaptic lifetimes at all. For instance, Fig. 8 presents the
timelines of synapses. We note that many are short lived and a few live long.
This result is expected to hold regardless of the specific problem being
solved, as long as we do not trace the identity of the neurons that the
synapses are connecting. Clearly, the number of long-lived synapses must
increase with the number of neurons in a complex. However, the lifetime
distribution should remain relatively robust. This is expected because what
determines the lifetime of a synapse is essentially its local information, and
this is independent of other properties of the network. For example,
increasing the parameter a, which controls how weaker connections are
penalised, modifies the probability R of synaptic disbanding so that the
average synaptic lifetime becomes shorter.
If however the target or intensity of selection is changed, the effect on the 
distribution of synaptic lifetimes is minimal (Appendix [ 5 ]). 

Our results show that synapses that are established early tend to last 
longer than synapses established in later developmental stages (Fig. in 
This is a natural outcome of the model; we have not purposely imposed any
mechanism that presupposes this behaviour. It happens due to historical
contingencies. Some of the synapses that are initially established lead
the network close to a local peak. Afterwards other synapses fine-tune the
network, increasing fitness in smaller amounts. Only after the network has
been populated can new connections substitute the original ones. Figure
9 reveals how synaptic lifetime varies with the cost. For newly formed
synapses, low costs do not show a central trend, implying a certain degree
of resilience or robustness. In the long run, in the small cost range, synapses
established later live shorter as the cost increases. For the larger cost
range, synaptic lifetimes tend to increase with cost. However, recall that for
costs k > 0.5 the networks have low or no connectivity. As the cost increases,
the network starts to dispense with inhibitory connections. Recall that
shutting unwanted neurons down only decreases the background noise, but does
not affect the result of the ‘good’ neurons. Thus, synapses amongst these
‘good’ neurons tend to live longer, as they are essential for the system’s
performance.

Given the redundancy of optimal solutions, major restructuring of the
networks might be expected during a lifetime; it is, however, expected to be
only occasional, for two reasons. First, once a peak has been found,
disbanding the principal connections leads to major function impairment
(fitness decrease). Second, establishing new synapses that are potentially as
good as the existing central ones is unlikely due to the costs of synaptic
connections. However, we do find occasional major network restructuring.
Because of the strong coupling amongst several neurons and synapses, once a
major connection is destroyed this can cause further impairments by subsequent
changes that result in even worse fitness. At some point there is a
restitution of the system when a new fitness peak is approached. This has the
qualities of self-organised criticality. However, it might also be that the
stochastic behaviour allows a few groups to shift from sub-optimal solutions
to better ones, effectively jumping across fitness peaks. The subsequent
replication of these successful solutions to other groups can result in a full
escape from impasse states. (In evolution this process is known as shifting
balance [22, 23].)


4 Discussion 


4.1 Relationship to previous models 


Using a Bayesian framework, Ullman et al. [12] proposed a model that explains
aspects of cognitive learning in children. In their model, the brain implements
a Bayesian update algorithm to form theories based on observed data. They
stress that in contrast to a more reductionist ‘connectionist’ view, Bayesian
learners explain neuropsychological aspects of the dynamics of learning. In
their framework, theories map to a multidimensional landscape where
well-formed theories lie at peaks. The dynamics include learning, but only as
a local process. They argue that learning cannot account for the invention of
new theories, but rather only modifies the degree to which we believe in any
given theory. That is, learning acts to fine-tune the theory around a peak.
The proposal of new theories does not happen through learning, but through
stochastic modification of the existing theories (in a data-independent
manner, that is, there is random variation). If a new theory scores better than
the previous ones, it is adopted with a certain probability [24].


This model is similar to ours, particularly in the implementation.
However, there is nothing mystical about this coincidence. What Ullman et al.
and we describe belongs to the Markov Chain Monte Carlo class of
models, which is a popular methodological toolbox in stochastic processes. One
important common aspect is that learning alone does not produce any new
configurations (networks in our case, theories in theirs), but only improves
local adaptation given the current configuration. Whilst their model describes
processes occurring at a high level of cognition, we describe simpler processes
at the neurophysiological scale. However, we reach similar conclusions
regarding the need for and limitations of learning in relation to other
processes that can generate variant configurations that lead to a better
performance. This coincidence and its consequences (see discussion in Ullman’s
paper) are preliminary evidence supporting our proposed physiological
mechanisms.

Although Ullman et al. do not discuss the states of impasse, these are
implicit in their models. Also, the explanations they give in terms of ‘theory
formation’ are to a large extent equivalent to those of our model. That
is, learning a local peak restricted to a given configuration results in weight
values that always lead to lower scores if any modification is introduced.
They resort to stochasticity (see below) as a means to jump across peaks. In
our case this stochasticity is introduced via synaptic plasticity. There can
be other sources of stochasticity, which we address below. However, synaptic
plasticity not only accounts for the randomness necessary to escape peaks,
but is also known to be an important component of learning.
The question of what generates circuit diversity remains so far open: we
propose that SSP can account for this diversity.

Note also the crucial difference in the search mechanisms in the two
models. Kemp and Tenenbaum [24] use a greedy search algorithm on
stochastically generated variation, whereas we adhere to the view of Darwinian
neurodynamics, also known as the neuronal replicator hypothesis [25, 26]. The
point is that on vast combinatorial landscapes, evolutionary search is known
to produce impressive results. The greedy algorithm works for relatively
small spaces but for larger spaces more efficient search is needed, as Kemp
and Tenenbaum [24] acknowledge. It is remarkable that although Ullman
et al. [12] explicitly draw a rugged conceptual landscape, the possibility of
evolutionary search is not mentioned. Interestingly, this can lead to
predictions. That is, we expect a positive correlation between the rate of
synaptic turnover and the capacity to overcome impasses. For instance, the
rate of synaptic turnover tends to be slower in adults, who are often less able
(or slower) than children to overcome impasses in certain neuropsychological
test problems. It has not escaped our attention that this can also
be addressed through experimentation in animal models from several points
of view, from behavioural to neurophysiological.

Besides stochasticity in circuitry, as addressed in this paper, we
identify at least two more stochastic sources. One has been addressed already
in the first part of our work, namely the stochastic nature of neuron firing.
Particularly in our infinite layer model, the solution to any posed problem
exists a priori, because all possible combinations of spiking patterns are
represented. Clearly, any arbitrary configuration will occur in an
infinitesimally low proportion. However, any random spiking pattern that is not
causally supported by a synaptic network is extremely unlikely to be repeated
in the following rounds of evaluation. Therefore, even though it might be
selected for one round, this layer will be overwritten at a later stage. Thus
the stochasticity from neural spiking is not a source of heritable variation
across groups. Since neurons spike with probabilities given by their input
activity, the only meaningful variation that can be produced is that which
follows from modification of synaptic weights. However, as we have extensively
argued, only this provides the substrate for selection to act on a local basis.

We call attention to a partly related model by Seung [9] invoking
‘hedonistic synapses’ that would release neurotransmitters stochastically, and
an immediate reward would either strengthen or weaken them according to
whether vesicle release or failure preceded reward, respectively. It was noted
that such randomness in synaptic transmission would play the role of
mutations in a Darwinian analogy. Seung also notes that stochasticity in action
potentials could play a similar role, and that mechanism would be faster [27].
But since ultimately these mechanisms operate on a fixed topology, the
limitations without structural plasticity remain. Note that ‘copying’ in our
mechanism is a fast component, intermediate between spikes and Hebbian
synaptic plasticity. This is a valid assumption if we assume that copying is
aided by dedicated adaptations (cf. Adams, 1998).

Another source of stochasticity can result from the competition amongst 
not an infinite but a relatively small number of groups. In this case the de¬ 
tails of how neuronal copying occurs can be important. Some of the N groups 
that have a solution with superior fitness score have to overwrite some groups 
that have lower score. This cannot be done in a proportional way, as in the 
infinite layer model, partly due to the topographic, rather than merely topo¬ 
logical structure of neuronal networks. Therefore, there is some randomness 
of which groups are overwritten and which ones not. Naturally, the ones 
with higher fitness are more likely to be transcribed to other groups, but 
ultimately the process introduces fluctuations due to finite size. The nature 
of these fluctuations is simply binomial on each neuronal locus (assuming
that each one is copied independently), introducing a variance of the order
of p(1 - p)/N. Thus if there are relatively few groups, the stochasticity is
strong, allowing for a shift in configurations, avoiding impasses and
facilitating the convergence to the optimum. This might lead us to think that
fewer groups would be advantageous for this purpose. However, with very few
groups, this sampling effect overrides selection, no hill climbing can occur,
and thus no learning can happen. Hence, it becomes clear that there must be
a compromise between these factors, or even an optimal layer number that
ensures learning but still facilitates escaping impasses. (However, we are of
the opinion that what limits the number of groups is physiological costs, not
this trade-off.)
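
The finite-size fluctuation described above can be checked numerically. The following is a minimal sketch, assuming that p denotes the frequency of the 'on' state at one neuronal locus and that each of the N groups receives that state independently; the function name and the default parameters are illustrative and not taken from our simulations.

```python
import random

def sampled_frequency_variance(p, n_groups, trials=5000, seed=0):
    """Empirical variance of the 'on' frequency at one neuronal locus when
    each of n_groups groups independently receives the 'on' state with
    probability p (binomial sampling; an assumed reading of the text)."""
    rng = random.Random(seed)
    freqs = []
    for _ in range(trials):
        on = sum(1 for _ in range(n_groups) if rng.random() < p)
        freqs.append(on / n_groups)
    mean = sum(freqs) / trials
    return sum((f - mean) ** 2 for f in freqs) / trials

if __name__ == "__main__":
    p = 0.3
    for n_groups in (5, 20, 100):
        # Compare the simulated variance with the binomial value p(1 - p)/N.
        print(n_groups, sampled_frequency_variance(p, n_groups), p * (1 - p) / n_groups)
```

With few groups the simulated variance is large, and it shrinks roughly as 1/N, which is the trade-off between exploration and selection discussed in this paragraph.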

Summarising, there are three central sources of stochasticity: in spiking, 
in circuitry and random sampling. The first one is important for short-term 
learning, whilst the second and third are important for long-term learning. 


4.2 Gating can act as a selective mechanism 

It could be argued that we do not require invoking layer copying in order 
to solve problems or even impasses. For instance, gating mechanisms could 
account for the selective amplification simply by overweighting the outcome 
of solutions with larger score. In other words, gating can implement selection 
as efficiently as a more complex network to network transmission mechanism. 
This is because instead of increasing the number of groups that produce good 
solutions, a single group with such a solution is rewarded preferentially. 

To visualise the difference we make an analogy. Suppose we are mixing
several paint colours to reach a given tonality. For this purpose we have a
stock of basic colours and a set of pipes leading to a vessel where the colours
are mixed. Thus we need to pour a given amount a% of base paint A, b% of base
paint B, etc. Suppose that a > b. We can increase the flow of A over B (i)
through a single pipe, by adjusting the tap, or (ii) by using identical pipes,
all of which have the same flow, but with more pipes for A than for
B. The former case is analogous to a gating mechanism, whereas the latter
is analogous to increasing the representation of the groups.
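
The paint analogy can be written out in a few lines. This is only an illustrative sketch: the function names and the example flows are hypothetical, and the code says nothing about how gating or copying would be realised neuronally.

```python
def mix_by_gating(colours, flows):
    """One pipe per colour; each pipe's flow is set by its tap (gating)."""
    total = sum(flows)
    return {c: f / total for c, f in zip(colours, flows)}

def mix_by_copying(colours, pipe_counts):
    """Identical unit-flow pipes; a colour's share is set by the number of
    pipes that carry it (increased representation)."""
    pipes = [c for c, k in zip(colours, pipe_counts) for _ in range(k)]
    return {c: pipes.count(c) / len(pipes) for c in colours}

if __name__ == "__main__":
    # Three units of A to one unit of B, obtained in both ways.
    print(mix_by_gating(["A", "B"], [3.0, 1.0]))
    print(mix_by_copying(["A", "B"], [3, 1]))
```

Both routes give the same mixture. The copying route can only realise proportions that are multiples of one over the number of pipes, which is one way to see why a finite number of groups introduces sampling effects.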


The direct actor models [14] describe a hill-climbing situation that is most
consistent with this gating mechanism. This is because the rewards, being
directed to neuronal activity, do not require a mechanism of neuronal copying.
Rather, it is enough for two (or more) competing agents to adjust output
voltage. This effectively implements hill climbing. In the case of the direct
actor, gating mechanisms are conspicuously clear: learning acts directly
on the strength of selection, and the closer a complex is to a solution, the
stronger selection for it becomes, and the weaker the competing complexes
become. Although invoking gating seemingly renders the assumption of neuronal
replication obsolete, it also poses some issues. First of all, selection
through gating remains a local learning mechanism that can lead to local
fitness peaks, resulting in impasses. In fact, if we assume that during copying
the whole content (not just one or a few neuronal loci) is transferred to a new
layer, the equations that describe gating and copying are the same. Second, if
we want to consider SSP for long-term learning, then we need to invoke a
mechanism that creates and tests novel circuitry, rather than acting on the
standing variation of the available strategies. In essence this mechanism has
to include copying of the group in order to evaluate which circuit is better,
and then discard the worse one. Thus in any case, processes of neuronal copying
must exist.

However, this criticism does not mean we are rejecting gating as a selective
means; we just point out that it is not enough to implement the complete
procedure of evolution-enhanced learning. In fact we embrace the possibility
that both mechanisms, namely gating and copying, can act together. This
is not only plausible for a finite number of groups, but also an efficient way
to modulate the balance between selection and random sampling [13,28],
potentially facilitating overcoming impasses.


4.3 Size of the neuronal complexes 

The complexity of the brain is reflected by the number of its constituent
cells (billions of neurons) and by the number of synapses (on the
order of trillions). In some way this accounts for the cognitive capacities of
humans, although how is not fully clear. We have presented a hypothesis
that serves as an organising principle for this complexity. However, we have
considered systems that employ as few as 10 neuronal loci on each layer.
Although this small number is partly motivated by computational convenience,
there are reasons to think that each layer might not require an excessive
number of neurons. First of all, it is well known that the brain is highly
modular, with different neuronal complexes allocated to specific functions. We
believe that some of these modules might be specialized for processing
information in the way we propose, and thus are expected to be sub-structured
into smaller functional complexes, each of which is constituted by a system of
interconnected groups acting in parallel and competing to solve tasks. Thus,
the actual number of neurons dedicated to any given task depends on the
number of groups that are recruited for processing a given input, not just
on the number of neuronal loci. Second, modular networks also facilitate
rewiring because finding the right network configuration becomes increasingly
harder for larger numbers of neurons. Hence, for an efficient implementation
of SSP, brains might work in a modular way to facilitate rewiring of small
complexes. Third, most complex tasks are likely to be split into subtasks,
each of lower complexity and employing relatively small circuits. In this
divide-and-conquer strategy, these smaller circuits can in turn be included in
larger complexes to accomplish more elaborate tasks.

We may additionally argue that neuronal systems can show invariant
properties, so that larger networks behave in a way similar to smaller
networks. For instance, the distribution of synaptic lifetimes does not depend
on the size of the system, and is therefore invariant. Although we expect the
time required to reach a given solution to increase with the size of the
system, we expect this time to scale in a particular way. This scaling may also
imply that more complex problems need to recruit more groups, although this is
not a straightforward requirement.

4.4 The expansion-renormalisation model 

Kilgard proposed a verbal theory based on Darwinian dynamics that also
accounts for circuitry variation and stabilization, which he termed the
expansion-renormalisation model (ERM) [8]. As mentioned in the introduction, the
ERM assumes the generation of variant circuitry, which results in accelerated
learning. He correctly points out that previous Darwinian frameworks do not 
take into consideration mechanisms of variation. Kilgard accounts for such 
variation in a verbal model; we have a similar stance regarding these ideas, 
but our approach is a formal one. More specifically, unlike Kilgard’s work, 
we assume specific neuronal rules that account for circuit variability. SSP 
constitutes the basic process that allows circuits to be modified. However, 
resorting to SSP also demands understanding or assuming factors that drive 
SSP with the purpose of circuitry modification. We have assumed two prin¬ 
cipal means for variation of the network structures, namely establishment of 
new synapses and disbanding of old synapses. The former follows a higher 
instance of Hebb’s rule, which simply means that unconnected neurons that 
co-spike can become connected. The specific neurophysiological processes 
that facilitate rewiring amongst two arbitrary neurons are unknown. Re¬ 
garding the second factor of SSP, the disbanding of existing synapses, we 
have introduced a novel idea. That is, we assume that the disbanding prob¬ 
ability decreases with the amount of local information. Moreover, we have 
shown that synaptic information is proportional to the square of the synap¬ 
tic weights. This is an interesting result because it relates the mechanistic 
aspects to the intuitive notion of neuronal function. Together, these two 
mechanisms determine the dynamics of variation after layer replication. 


Along these lines, Fauth et al. [20] propose and analyse a model for the
distribution of synapses between two neurons, by studying the interplay between
Hebbian learning and SSP in a way similar to ours. They assume a
constant rate of synaptogenesis for unconnected neurons, unlike the
Hebbian-like mechanism we employ. Synaptic disbanding occurs with probability
$P_r = p_0 \exp(-\alpha \phi^{a})$, where $p_0$, $\alpha$ and $a$ are positive
constants, which is of similar form to ours, R (Eqns. 9 and 10). Although in
their case this form is not motivated by information content, they do point out
that the topology of the network might constitute the basis of information
storage, and they highlight the role of this storage in memory. The topology,
in turn, is determined by the balance between synaptogenesis and disbanding.
Thus, although they do not explicitly assume that disbanding decreases with
the information capacity of a synapse, in the context of our model their model
comes down to precisely that.

Kilgard’s ERM hypothesizes that there must be a transient increase of 
circuitry variability (expansion) with a subsequent pruning of sub-optimal 
synapses, reducing the variation (renormalization). We have not seen evi¬ 
dence for this. Rather, we find an increase of circuitry variability, with an 
eventual stabilization. The Monte Carlo implementation allows for persis¬ 
tent fluctuations and occasional ‘avalanches’, which afterwards recover and 
re-establish the network functionality. Although the behaviour of Kilgard’s 
model and ours is different, we think that the disparity is superficial. The 
expansion that creates a standing variation that is later reduced might occur 
under specific fitness landscapes and might thus be problem-dependent. In 
our case, since the fitness landscape has many equivalent maxima allowing 
for equally good solutions, there is no force that generates excess
variability. However, certain types of non-linearities in the fitness landscape
can certainly lead to that behaviour. It may be that the ‘expansion’ phase of
the ERM is not a pervasive feature of neurodynamics, but rather a context- and
stimulus-dependent attribute. However, we must also point out that in our
implementation we only allow for circuit modifications in one layer at a time
(coincidentally, as Fauth et al. [20] do). This circuit might be either copied
to all groups, or discarded altogether. By assuming that many groups can
develop different circuits in the same evaluation round we might nevertheless
find the expansion phase. (In fact, in evolutionary models that allow for
high mutation rates there can be a transient increase of genetic variability,
which is equivalent to the expansion phase of the ERM; e.g. [29].) Finally,
it may well be true that new synapse formation is adaptive in that its rate
is increased by the appearance of novel tasks, for which there is some
evidence [7, 30-32]. This provocation-based mechanism would easily lead to the
expansion phase.


4.5 Towards replicative neurodynamics 

In this article we theoretically test a novel mechanism for neural function. We
have found a crucial synergy between learning and fitness climbing,
strengthening previous, related findings [13]. We have used simple models to
show that the combination of selection and learning is an extremely efficient
one. Moreover, we also showed the relevance of SSP in this context:
modifications of the learning topology of networks. However, there is another
open possibility that can result from a combination of learning and plasticity.
We showed that certain networks are in general terms more efficient learners.
Thus, the recruitment of an existing, efficient network could in principle
lead, through synaptic plasticity, to the copying of its structure in the
current network. In a way analogous to how DNA is copied, an existing network
could replicate and such a structure would spread, allowing for problem
solving. This has been previously proposed as the neuronal replicator
hypothesis [13,25,26,28]. The idea is very recent, which is why there has
still been no experimental verification. In this article, however, we advanced
the mechanisms that justify neural replicators. How the copying of network
topologies can occur remains open to study.

Related to the issue of exponential strengthening versus exponential
replication, the path evolution algorithm by Fernando et al. [33] is a
remarkable suggestion. In that model, neurons along a path are assumed to code
for some behaviour. Whilst neuronal activity is fixed, paths grow collaterals
and thus recruit new nodes. Neuronal activity can spread probabilistically
along different paths, which can be evaluated and compared according to
some performance (fitness) measure. Good paths become strengthened by
reward, whereas bad ones are weakened. Various paths can have few or many
neurons in common. This algorithm explicitly incorporates structural plasticity
and selection and, despite the differences, is thus the closest precedent to our
model. We improve on that model in two respects: we present a mathematical
framework (as opposed to mere simulations) that takes the first steps
to unite the theory of learning with that of natural selection, and we consider
recurrent networks, which posed a special problem for path evolution.


4.6 Mutations and recombination as creative sources 


We should call attention to two possible usages of the term ‘mutation’ in the
neuronal context. One we have seen before: stochasticity in firing or
transmitter release. The other is structural plasticity itself: the term
‘synaptic mutation’ was coined in this latter sense by Adams [34], who also
foresaw the potential importance of the phenomenon for the performance of the
nervous system. Note that Fernando et al. [33] use structural plasticity in
their path evolution model to implement ‘crossover’ between different paths.

Although visually reminiscent of genetic recombination, synaptic mutations
and path crossovers are not formally equivalent to DNA crossover.
In genetics, recombination does not create new allelic variability (on/off prob¬ 
abilities). Instead, it reshuffles the existing variants at any given locus. This 
certainly results in a ‘macromutation’ at the phenotypic level, but the genetic 
variability of the population remains intact. Thus, recombination does not 
increase allelic variation, but it does increase variation across groups. The 
distinction is important in the context of our work: we assume the equivalent 
to high recombination. Namely, at any given neuronal locus, the copying can 
occur from any other group, irrespective of the state of the other neuronal 
loci. This provides the highest rate of reshuffling, and is thus ‘creative’. The 
contrary limit is when only the complete content of a selected group can 
overwrite an out-selected group. This is an ‘asexual’ limit in that there is no 
recombination. The latter provides the fastest selective response, but is less 
creative because it has no combinatorial power. 
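
The two copying limits contrasted above can be sketched as follows. This is a toy illustration under stated assumptions: groups are lists of binary neuronal states, donors are drawn with probability proportional to a given fitness score, and the function names are hypothetical rather than part of our model code.

```python
import random

def copy_per_locus(groups, fitness, rng):
    """High-recombination limit: each locus of the new group is copied from
    a donor drawn independently (fitness-weighted) for that locus."""
    n_loci = len(groups[0])
    donors = rng.choices(range(len(groups)), weights=fitness, k=n_loci)
    return [groups[d][locus] for locus, d in enumerate(donors)]

def copy_whole_group(groups, fitness, rng):
    """'Asexual' limit: one fitness-weighted donor overwrites every locus."""
    donor = rng.choices(range(len(groups)), weights=fitness, k=1)[0]
    return list(groups[donor])

if __name__ == "__main__":
    rng = random.Random(0)
    groups = [[0, 0, 0, 0], [1, 1, 1, 1]]
    fitness = [1.0, 1.0]
    print(copy_per_locus(groups, fitness, rng))    # may mix states from both donors
    print(copy_whole_group(groups, fitness, rng))  # always an existing configuration
```

Whole-group copying can only reproduce configurations that already exist, whereas per-locus copying can assemble configurations present in no single donor, which is the combinatorial power referred to in the text.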

4.7 Relationship to evolvability

Adams’ synaptic mutations and Fernando’s path crossover are analogous to
modifications in the architecture of traits. This is a more powerful type of 
macro-mutation because it truly modifies the decoding of the information 
stored in neuronal states, in an equivalent way as development decodes the 
genetic information into a trait, which is one of the fundamental aspects of 
evolvability. 

Evolvability is understood as the potential of a population to respond to 
selection. How fast the response to selection is depends on the amount of 
genetic (or heritable) variation that can be produced. This can be given by 
standing variation, cryptic variation (due to epistasis, for example), or due 
to mutational variance. Although high mutation rates will provide source 
‘material’ to respond to selection, these will also create load that keeps the 
population maladapted. However, the optimal scenario is achieved if muta¬ 
tion rates can be increased as selection is started, and tuned down once the 
population approaches adaptation. 

As we saw above, this is precisely what happens with the neurodynamics
we have described. Of course, genetic systems do not have a learning
mechanism as the brain does. Nevertheless, the two are analogous. We want to
take the analogy further and interpret the input current E as a quantitative
trait, with the weights $\phi$ taking the role of additive effects.

A previous model in quantitative genetics has taken an approach reminiscent
of ours [35]. The authors did not apply a learning mechanism, but considered
modifier alleles for the mutational effects. These are selected indirectly,
increasing the transition rate in the direction of the highest increase in
fitness, in a way that is analogous to our switching probability, which
generates variability in order to speed up the increase in fitness.

4.8 Information storage in neuronal circuits

Understanding the relationship between information capacity and synaptic
changes is central in order to understand learning, memory and other aspects
of cognition [3,17]. Under Hebbian learning, information storage relies solely
on the modification of synaptic weights and is contingent on the existing
connections. In the SSP scenario, the information capacity of neuronal
complexes is adjusted through modification of the connections [21]. Despite the
advantages of this mechanism in learning, it poses a challenge because of all
possible synapses that can exist between every pair of neurons, only a small
fraction is realised, even within a complex.

Even granting that SSP is a mechanism for the exploration of alternative
circuits, discovering the right synaptic configurations in this vast combina¬ 
torial space is a major problem. Merely grasping the efficiency of the brain 
in managing multiple tasks of combinatorial complexity demands an under¬ 
standing of the mechanisms behind such capabilities. In other words, how is 
the brain implementing algorithms for searching the combinatorial space of 
solutions? 

Our model introduces means to explore the combinatorial space, imple¬ 
menting mechanisms that are analogous to biological evolution. From this 
perspective, understanding the algorithmic means used by the brain comes 
down to understanding and identifying what constitutes the units of selec¬ 
tion and, crucially, what are the units of variability. This has been one of 
the central questions of our research, which we have clarified by employing 
our analogy with genetic systems. 

Understanding the physiological basis of learning and cognition requires 
the identification of the modifications that occur during learning, and which 
result in specific circuitries directed by different stimuli and experiences.
However, we point out that there is a fundamental relationship between 
units of learning and units of variability, which we must consider in order 
to understand and identify which are the units of learning. 

In our model information content is stored not in the activity of neurons,
but in the synaptic weights and switching probabilities. This has two
important implications. First, it suggests that the loci of memory are circuits
(not neurons), even though the mapping between memory loci and cognitive
functioning might be mediated through the coordinated spiking of individual
neurons (cf. [10]).

Most importantly, however, we have shown that SSP is the process that
directly mediates establishment and disbanding of synapses, resulting in
circuits that represent solutions to specific problems. Previous findings also
support the notion that information is stored in circuits, not neurons [21,36].
Furthermore, at a higher cognitive level, it has been proposed that
consciousness can be gauged through information integration measures between
neuronal complexes [16]. At a phenomenological level it is well known that
SSP, at the level of both spine growth and modifications of the synaptic
networks, is directly influenced by sensory experience [17,18], or by
manipulating neuromodulators [19]. Despite these lines of evidence, which are
compelling for our theory, we still lack direct experimental verification
regarding the minimal complexity in the circuit distribution that results from
solving particular tasks.

The second important implication is the procedural relationship between 
learning and variability. Even when these two are not the same and they 
constitute fundamentally two different processes, we have shown how learn¬ 
ing fine-tunes the generation of variability. At the synaptic level, Hebbian 
learning modifies switching probabilities, which are the mechanism for gener¬ 
ating variability in spiking. At the level of circuitry, SSP dictates longer-term 
changes, where informative synapses persist and uninformative synapses are 
disbanded. Altogether these two processes mediate the exploration of the 
complex combinatorial space by generating the required variability, guided by 
learning. These are the ‘fuel’ for the motor that results in effective changes, 
which are, ultimately, selection mechanisms. 

Although the question of neuronal and circuit information storage has
been extensively discussed, to our knowledge we are the first to consider
informational aspects as mechanisms that direct synaptic ‘survival’ and
lifetimes (see below). By showing that synaptic information between two
neurons is proportional to the learning weight and to the switching
probabilities, we identified a plausible and verifiable mechanism through which
SSP can be directed. This is crucial because we still do not know the causes
and consequences of structural plasticity at a cognitive level. Still, our
hypothesis is supported by known facts, chiefly experience-driven SSP, where
dendritic spine growth is observed in vivo in the neocortex of adult rats
[18,21,36].


4.9 The lifetime of synapses 

Despite the known potential of neuronal complexes to undergo
experience-driven restructuring of synaptic networks [37,38], we still do not
know how frequently synaptic connections are modified in adult brains [17]. On
the basis that the number of synapses remains practically constant during
adulthood, it has long been argued that the rates of synapse establishment and
disbanding balance each other. Our result indicates that the length of synaptic
lifetimes follows a statistical distribution, and the question of synaptic
stability is a quantitative one rather than a yes/no statement. Once a
stationary state has been reached, the number of connections remains more or
less constant with an unchanging distribution of synaptic lifetimes. Even then,
circuits are by no means fixed but, rather, show quite a degree of dynamism.


The SSP model of Fauth et al. [20] studies the equilibrium distribution
of synaptic connections between two neurons. In their work, as in ours, the
general shape of the distribution of synapses is robust to model
parameters. However, they analyse synaptic stability between two neurons, not
the general distribution of synapses of a network (and whereas they allow
multiple synapses between neuron pairs, we do not take this possibility into
account).

Through three-dimensional reconstruction of cultured neocortical cells it
was determined that 75% of the connections can be explained by assuming
that pairs of neurons connect randomly. However, the remaining 25% of the
connections requires invoking function, anatomic differences and positioning,
amongst other factors that facilitate chemical mechanisms for attraction or
repulsion in order to complete or avoid the establishment of new synapses.

Direct experimental information regarding the rates of structural plasticity
comes from murine models. Although dendritic spines in the cortex of
adult brains tend to be rather stable, a proportion of them show considerable
variation, and experience can induce not only their growth or shrinkage [40]
but also de novo formation or elimination of spines, which, importantly, occur
at balanced rates. Moreover, these structural spine dynamics are known
to be similar in distinct parts of the brain [40-42]. Consequently, it is safe
to assume that the physio-anatomical principles, if not the rates, that
maintain the distribution of spine size and numbers are similar in distinct
cortical regions.

Although direct observation of synaptogenesis is more elusive (mostly due
to technical limitations), it is known that dendritic spines sometimes result
in new synapses [43]. Changes in axon terminals occur over several weeks in
young brains [44]. A compelling example is that of the visual cortex at early
ages, where synapses established during the developmental critical period are
stable for more than 13 months [45].


5 Numerical and Simulation Methods 

5.1 Neuronal dynamics 

To simulate the time evolution of a complex of multiple groups (effectively
infinite in number), each with a fixed number n of neurons, we solve
a system of coupled ordinary differential equations: n of them represent
the change in spiking probability and, along with these, a set of differential
equations describes the change in learning weights (i.e. Oja’s rule for Hebbian
learning). The number of learning equations depends on the connectivity of
the learning network, which we assume to be undirected. The initial conditions
for the spiking equations are random deviates from a uniform distribution
between zero and one (unless otherwise stated). The initial conditions for the
learning equations are random deviates from a uniform distribution on
(0, 0.01], unless otherwise stated. The system of equations is solved
numerically for t = 10000 time units. (This number ensures convergence of the
system for all parameters used, and overall is considered to be on the order
of ~10 ms. Although we could choose S and other parameters such that the
system is measured in the relevant units, numerically it is more convenient to
employ the mentioned scale.) All simulations were implemented and solved
in Mathematica 9.0 and/or 10.0.
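
To illustrate the learning part of the integration, the sketch below applies forward Euler steps to Oja's rule in its textbook form, dw/dt = eta * y * (x - y * w) with y = w . x, for a fixed input vector x. This is an assumption made for illustration: it is not the ensemble form of Eq. 2, it omits the coupling to the spiking equations, and it is written in Python rather than Mathematica. The step size, learning rate and function name are likewise hypothetical.

```python
import random

def integrate_oja(x, eta=1.0, dt=0.01, steps=2000, seed=0):
    """Forward-Euler integration of textbook Oja's rule for a fixed input x."""
    rng = random.Random(seed)
    # Initial weights: uniform deviates on (0, 0.01], as stated in the text.
    w = [0.01 * (1.0 - rng.random()) for _ in x]
    for _ in range(steps):
        y = sum(wi * xi for wi, xi in zip(w, x))  # postsynaptic activity
        # Hebbian growth (y * x) balanced by the normalising decay (y^2 * w).
        w = [wi + dt * eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

if __name__ == "__main__":
    w = integrate_oja([3.0, 4.0])
    print(w, sum(wi * wi for wi in w))  # weights align with x and have unit norm
```

The decay term keeps the weight vector bounded, which is what lets the learning equations reach an equilibrium before any rewiring is attempted.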

5.2 Random networks 

To explore learning network topologies, we generated random graphs from
three classes of distributions. The first one is the Erdős-Rényi model (ER),
where nodes are connected randomly. This model assumes a fixed number of
nodes n and a certain probability r that each node is connected to any other
node. The second model is the Barabási-Albert (BA) model, famous for its
scale-free properties. The BA model also employs two parameters that control
the network topology: the fixed number of nodes n and the number k of edges
that are preferentially attached to each node. The third model is the
Watts-Strogatz (WS) model of so-called small-world networks, which takes as
parameters n nodes and a probability r of rewiring an edge between two nodes in
such a way as to avoid loops. In all cases we forbid multiple edges and
self-connections. These network models are built into Mathematica and were
employed as indicated in the software’s Documentation Centre.
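
As an illustration of the simplest of the three classes, the following sketch generates an undirected Erdős-Rényi graph in plain Python. It is not the Mathematica built-in we used; the function name is hypothetical, and storing edges as pairs (i, j) with i < j is a convenience that rules out self-connections and multiple edges by construction.

```python
import random

def erdos_renyi_edges(n, r, seed=None):
    """Undirected G(n, r) graph: every pair of distinct nodes is connected
    independently with probability r. Edges are stored as (i, j), i < j."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < r}

if __name__ == "__main__":
    edges = erdos_renyi_edges(10, 0.3, seed=1)
    mean_degree = 2 * len(edges) / 10
    print(len(edges), mean_degree)  # the expected mean degree is r * (n - 1)
```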

5.3 Structural synaptic plasticity


We assume that Hebbian learning happens faster than SSP, leading to a
separation of time scales. Consequently, we allow the learning equations to
reach equilibrium, and only then modify the learning network. We consider
the three modifications of a network indicated in the Model section. First,
we allow synapses to be formed by adding an edge between two unconnected
nodes i and j with a probability $q_{ij} \propto p_i p_j$. In this way, neurons
that co-fire tend to be wired together. Second, we allow synapses to be
eliminated by randomly removing edges between two connected neurons i and j
with probability $R_{ij} = \exp[-a H_{ij}]$, so that neurons that do not co-fire
tend to be disconnected. Third, we also allow random rewiring (irrespective of
firing probabilities) with a small probability u = 0.01: we randomly and
uniformly choose a connected pair i, j and eliminate the edge, and at the same
time choose an unconnected pair l, m and establish an edge. In each time step
any (including all) of the above events are allowed to happen. Once the
networks have been rewired, a new round of learning is performed. Initial
conditions may or may not be modified (see Results section). After a new
equilibrium is reached, the new fitness is compared to the fitness before the
rewiring. This is implemented through a Metropolis algorithm: if the fitness
is increased, the change is accepted, but if the fitness decreases, the change
is accepted with probability proportional to the ratio of new to old fitness.
We additionally impose a multiplicative fitness cost per synapse of
$\exp[-kd]$, where k is the penalty of each edge in the network, and d is the
number of edges of a given network. We typically run the simulations for at
least 250 steps, to ensure convergence. However, in different experiments
larger step numbers are used, as indicated in the figures.
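
The rewiring round described above can be sketched as follows. This is a hedged illustration, not the Mathematica code used for the figures: the proportionality constants in the formation and acceptance probabilities are set to one, the information values H_ij are passed in as a ready-made dictionary, and the function names are hypothetical. The learning step that yields the firing probabilities and the raw fitness is assumed to be computed elsewhere.

```python
import math
import random

def propose_rewiring(edges, p, info, a=1.0, u=0.01, rng=random):
    """One SSP proposal. edges: set of (i, j) with i < j; p: firing
    probabilities; info: dict mapping an edge to H_ij (assumed given)."""
    n = len(p)
    new_edges = set(edges)
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in edges:
                # Disbanding with probability exp(-a * H_ij).
                if rng.random() < math.exp(-a * info.get((i, j), 0.0)):
                    new_edges.discard((i, j))
            elif rng.random() < p[i] * p[j]:
                # Formation with probability p_i * p_j (constant assumed 1).
                new_edges.add((i, j))
    if rng.random() < u:
        # Random rewiring: remove one existing edge and add one absent edge.
        connected = sorted(new_edges)
        unconnected = [(i, j) for i in range(n) for j in range(i + 1, n)
                       if (i, j) not in new_edges]
        if connected and unconnected:
            new_edges.discard(rng.choice(connected))
            new_edges.add(rng.choice(unconnected))
    return new_edges

def penalised_fitness(raw_fitness, n_edges, k):
    """Multiplicative synaptic cost exp(-k * d) for a network with d edges."""
    return raw_fitness * math.exp(-k * n_edges)

def metropolis_accept(old_fitness, new_fitness, rng=random):
    """Accept improvements always; otherwise accept with probability equal
    to the ratio of new to old fitness (constant assumed 1)."""
    if new_fitness >= old_fitness:
        return True
    return rng.random() < new_fitness / old_fitness
```

In use, one would call `propose_rewiring`, rerun learning on the proposed network, apply `penalised_fitness` to both the old and the new network, and keep the proposal only if `metropolis_accept` returns true.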

Figure 7: Neuronal systems under costly synapses

(A) Dynamics of the fitness (relative to the maximum) of neuronal systems 
under different cost per connection. Black curves on top: low costs (k < 
0.05). Colour curves: intermediate costs increasing from k = 0.05 (red) to 
k = 0.5 (blue). Grey curves at the bottom: high costs (k > 0.5). Each 
curve is a replica of 77 independent simulations. (B) Equilibrium fitness as 
a function of the synaptic costs. Inset: Average number of synapses of each 
neuron (network degree) against the cost per synapse. (C) Mean switching 
probabilities of the connected nodes against synaptic cost. Each point is 
an average at the stationary values (last 50 time points) and over the 77 
simulations. Parameters: n = 10, T = 7, S = 10, $\lambda = 0.01$.



Figure 8: Timeline of synapses. 

Each row corresponds to a particular synapse between a given pair of neurons. 
Colors reflect the synapse length (only for visualization): blue: short lasting
synapses; red: long lasting synapses. In this case the cost of synapses is
k = 0; otherwise as in Fig. 7.
Figure 9: Average lifetime of synapses, measured over one history.

(A) Relationship between average and standard deviation of the synaptic
lifetimes for 14 independent realisations. Colors indicate the synaptic costs
k as in the inset legend. (B) Average lifetime as a function of the cost per
synapse k. Different colors represent 14 different (independent) realisations.
In both panels: squares indicate synapses established at early stages (before
200 iterations) and bullets indicate synapses established after reaching stationary
states (later than 200 iterations). Parameters as in Fig. 7.
A Hebb’s and Oja’s rule on the group ensemble


In this appendix we want to show that we can approximate the ensemble of 
weights by their mean. We will show that under certain assumptions, Hebb’s 
and Oja’s rules apply on the average, plus some variance terms. We argue 
that the variance terms are small and thus the average learning is enough. 
This approximation allows a tractable analysis of the neurodynamics without 
needing to follow the full distribution of associative weights in the ensemble. 

We first work out the simpler case of Hebb’s rule (Eq. 1 in the main text) 
and then Oja’s rule (Eq. 2 in the main text). We interpret the change in the 
average weight as the average weight change, i.e.

$$\frac{d\bar{\phi}_{ji}}{dt} = \left\langle \frac{d\phi_{ji}}{dt} \right\rangle , \tag{12}$$

where $\langle \ldots \rangle$ denotes the average over the ensemble population (for simplicity we will
also use the ‘bar’ notation, e.g. $\bar{\phi}$, to denote the same quantity). The average
is in principle taken over a joint distribution $\Pr[\boldsymbol{\phi}, \mathbf{X}]$, where the bold symbols indicate
the vectors of values (weights and neuronal states). However, we will make
the simplifying approximation that all these quantities are independent. In 
population genetics this corresponds to the ‘Hardy-Weinberg’ equilibrium, 
which we take throughout this article. 


Mean-field Hebb’s rule From Eq. 1 in the main text we have that 


$$\frac{d\bar{\phi}_{ji}}{dt} = \lambda \langle X_i Y_j \rangle . \tag{13}$$





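Using the definition of the activity, $Y_j = \sum_k \phi_{jk} X_k$, we have that

$$\langle X_i Y_j \rangle = \Big\langle \sum_{k \neq i} X_i \phi_{jk} X_k \Big\rangle + \langle \phi_{ji} X_i^2 \rangle . \tag{14}$$

We now use the independence assumption to further develop the two
averages. The first term is

$$\langle X_i \phi_{jk} X_k \rangle = \langle X_i \rangle \langle \phi_{jk} \rangle \langle X_k \rangle = \bar{\phi}_{jk} (2p_k - 1)(2p_i - 1) \tag{15}$$

(as long as $k \neq i$). The second term is

$$\langle \phi_{ji} X_i^2 \rangle = \langle \phi_{ji} \rangle \langle X_i^2 \rangle = \bar{\phi}_{ji} . \tag{16}$$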


Now we put together the two expressions, complete the sum to include
the term $k = i$, and some algebra gives

$$\frac{d\bar{\phi}_{ji}}{dt} = \lambda \Big[ (2p_i - 1) \sum_k \bar{\phi}_{jk} (2p_k - 1) + 4 \bar{\phi}_{ji}\, p_i (1 - p_i) \Big] . \tag{17}$$

Note that $(2p_i - 1)$ is the average input activity and that $4p_i(1 - p_i)$ is its
variance. Then

$$\frac{d\bar{\phi}_{ji}}{dt} = \lambda \bar{X}_i \bar{Y}_j + \lambda \bar{\phi}_{ji}\, \mathrm{var}(X_i) , \tag{18}$$

where the notation $\bar{Y}$ is a shorthand for the output of the average activity
(and not the average output, which is a correlation between $X$ and $\phi$).

The last expression shows that, to a first order approximation, Hebb’s
rule can be applied to the average. In most cases $\mathrm{var}(X_i)$ is small. Moreover,
once the system has substantially or fully learned, then $p = 0, 1$, making the
second term vanish.
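
As a sanity check of Eqs. 14 to 18, the identity can be verified by exact enumeration in the special case where the weights are held fixed (so that all weight variances vanish) and the inputs are independent with $X_k \in \{-1, +1\}$ and $\Pr[X_k = +1] = p_k$. The sketch below is illustrative only; the function names and the numerical values of `phi_j` and `p` are hypothetical.

```python
from itertools import product


def exact_correlation(phi_j, p, i):
    """Exact <X_i Y_j> with Y_j = sum_k phi_jk X_k, by enumerating all states."""
    n = len(p)
    total = 0.0
    for state in product((-1, 1), repeat=n):
        prob = 1.0
        for k, x in enumerate(state):
            prob *= p[k] if x == 1 else 1.0 - p[k]
        y_j = sum(phi_j[k] * state[k] for k in range(n))
        total += prob * state[i] * y_j
    return total


def mean_field_correlation(phi_j, p, i):
    """Right-hand side of Eq. 18 without the factor lambda."""
    mean_x = [2.0 * pk - 1.0 for pk in p]           # average input activity
    mean_y = sum(w * m for w, m in zip(phi_j, mean_x))
    var_xi = 4.0 * p[i] * (1.0 - p[i])              # variance of X_i
    return mean_x[i] * mean_y + phi_j[i] * var_xi


# Hypothetical weights onto neuron j and spiking probabilities.
phi_j = [0.3, -0.2, 0.5, 0.1]
p = [0.9, 0.4, 0.7, 0.2]
lhs = exact_correlation(phi_j, p, i=2)
rhs = mean_field_correlation(phi_j, p, i=2)
print(lhs, rhs)
```

The two numbers agree to floating-point precision, and the variance term disappears when `p[i]` is 0 or 1, in line with the remark above.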
Mean-field Oja’s rule We proceed in a similar way as above, averaging
over Oja’s rule (Eq. 2 in the main text). First note that there are two terms:

$$\frac{d\bar{\phi}_{ji}}{dt} = \lambda \left( \langle X_i Y_j \rangle - \langle \phi_{ji} Y_j^2 \rangle \right) ; \tag{19}$$

the first term is given by the right-hand side of Eq. 18. Thus we concentrate
on working out the second term. We begin by expanding $Y_j^2$:

$$\langle \phi_{ji} Y_j^2 \rangle = 2 \sum_{k \neq l \neq j} \langle \phi_{ji} \phi_{jk} \phi_{jl} X_k X_l \rangle + \sum_{k \neq j} \langle \phi_{ji} \phi_{jk}^2 X_k^2 \rangle . \tag{20}$$

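Now we use the assumption of independence. However, we must note that
not all the terms in the first sum are independent, because some indices
can repeat. We first write

$$\langle \phi_{ji} Y_j^2 \rangle = 2 \sum_{k \neq l \neq j} \langle \phi_{ji} \phi_{jk} \phi_{jl} \rangle (2p_k - 1)(2p_l - 1) + \sum_{k \neq j} \langle \phi_{ji} \phi_{jk}^2 \rangle . \tag{21}$$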


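Now consider that

$$\langle \phi_{ji} \phi_{jk} \phi_{jl} \rangle = \begin{cases} \bar{\phi}_{ji} \bar{\phi}_{jk} \bar{\phi}_{jl} & \text{if } k \neq l \neq i \\ \langle \phi_{ji}^2 \rangle \bar{\phi}_{jl} & \text{if } l \neq k = i \\ \langle \phi_{ji}^2 \rangle \bar{\phi}_{jk} & \text{if } k \neq l = i \end{cases} \tag{22}$$

and in the second sum

$$\langle \phi_{ji} \phi_{jk}^2 \rangle = \begin{cases} \bar{\phi}_{ji} \langle \phi_{jk}^2 \rangle & \text{if } k \neq i \\ \langle \phi_{ji}^3 \rangle & \text{if } k = i \end{cases} \tag{23}$$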


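This separates the two sums into five sums. The rest of the derivation
is somewhat lengthy but straightforward. It continues by completing the
missing terms of the sums and then simply rearranging, and it leads to

$$\langle \phi_{ji} Y_j^2 \rangle = \bar{\phi}_{ji} \bar{Y}_j^2 + \mathrm{var}(\phi_{ji}) \bar{X}_i \bar{Y}_j + \sum_{k \neq i} \langle \phi_{jk}^2 \rangle \left( \mathrm{var}(X_k) - 1 \right) + \bar{\phi}_{ji}\, \mathrm{var}(\phi_{ji}) \left( \mathrm{var}(X_i) - 1 \right) + \langle \Delta\phi^3 \rangle \tag{24}$$

where $\langle \Delta\phi^3 \rangle = \langle \phi_{ji}^3 \rangle - \bar{\phi}_{ji} \langle \phi_{ji}^2 \rangle$.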

Finally, we note that altogether we can write the mean field of Oja’s rule
as the rule of the average activity:

$$\frac{d\bar{\phi}_{ji}}{dt} = \lambda \bar{Y}_j \left( \bar{X}_i - \bar{\phi}_{ji} \bar{Y}_j \right) + \lambda \times \text{variance terms} . \tag{25}$$
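
To illustrate the behaviour of the leading term of Eq. 25, the following sketch integrates the mean-field Oja rule for a single output neuron with fixed average inputs, dropping the variance terms altogether. This is an assumption-laden toy example: the function name `mean_field_oja`, the forward-Euler scheme, the step count and the input values are all hypothetical choices, with only $\lambda = 0.01$ taken from the figure parameters.

```python
import math


def mean_field_oja(x_bar, phi0, lam=0.01, steps=5000, dt=1.0):
    """Forward-Euler integration of d(phi_i)/dt = lam * Y * (X_i - phi_i * Y).

    x_bar holds the (fixed) average inputs and Y = sum_i phi_i * X_i is the
    output of the average activity. Variance terms are neglected.
    """
    phi = list(phi0)
    for _ in range(steps):
        y_bar = sum(w * x for w, x in zip(phi, x_bar))
        phi = [w + dt * lam * y_bar * (x - w * y_bar) for w, x in zip(phi, x_bar)]
    return phi


x_bar = [0.8, -0.2, 0.4]                 # hypothetical average input activities
phi = mean_field_oja(x_bar, [0.1, 0.1, 0.1])
norm = math.sqrt(sum(w * w for w in phi))
print(phi, norm)
```

Under these assumptions the weight vector aligns with the average input and its norm approaches one, which is the normalising property that distinguishes Oja's rule from plain Hebbian growth.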


B Mutual information contained in a synapse 

Mutual Information is defined as 


$$H[I,J] = \sum_{r,s \in \{0,1\}} \Pr[X_i = r \mid X_j = s] \Pr[X_j = s] \log \left[ \frac{\Pr[X_i = r \mid X_j = s]}{\Pr[X_i = r]} \right] . \tag{26}$$

The unconditioned probabilities are simply the ‘allele’ frequencies. Namely,

$$\Pr[X_i = 1] = p_i , \tag{27}$$

$$\Pr[X_i = 0] = 1 - p_i . \tag{28}$$


Conditional spiking and switching probabilities. To compute the 
conditional probabilities we evaluate the activity of the neuron i with the 
conditioned value for the neuron j, namely

$$Y_{i|j} = \sum_{k \neq i,j} \phi_{ki} (2p_k - 1) + \phi_{ji} (2X_j - 1) . \tag{29}$$

This leads to a conditioned switching probability $M_{i|j}$,

$$M_{i|j} = \left( 1 + \exp[Y_{i|j}] \right)^{-1} . \tag{30}$$

This conditional switching probability leads to a dynamical equation with the
new activity, whose equilibrium frequencies are $\Pr[X_i = 1 \mid X_j = s] = p_{i|j}$ and
$\Pr[X_i = 0 \mid X_j = s] = 1 - p_{i|j}$. The conditional frequency is given by the equilibrium
of

$$\frac{dp_{i|j}}{dt} = p_{i|j} (1 - p_{i|j})\, \partial_{p_i} \log[W] + M_{i|j} (1 - 2p_{i|j}) . \tag{31}$$

For the directional selection case the solution is 


$$p_{i|j} = \frac{S - 2M_{i|j} + \sqrt{4M_{i|j}^2 + S^2}}{2S} \tag{32}$$

and for stabilising selection


(33) 
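
The directional-selection solution (Eq. 32) can be checked numerically against the equilibrium condition of Eq. 31. The sketch below assumes that directional selection corresponds to a constant gradient, $\partial_{p_i} \log[W] = S$; this reading, the function names and the test values are assumptions made for illustration only.

```python
import math


def equilibrium_frequency(M, S):
    """Eq. 32: equilibrium frequency for switching probability M and gradient S."""
    return (S - 2.0 * M + math.sqrt(4.0 * M * M + S * S)) / (2.0 * S)


def rate_of_change(p, M, S):
    """Right-hand side of Eq. 31, assuming d(log W)/dp = S (directional selection)."""
    return p * (1.0 - p) * S + M * (1.0 - 2.0 * p)


# Hypothetical (M, S) pairs.
for M, S in [(0.01, 10.0), (0.3, 2.0), (0.5, 1.0)]:
    p_star = equilibrium_frequency(M, S)
    print(M, S, p_star, rate_of_change(p_star, M, S))
```

The residual printed in the last column is zero up to rounding, and the equilibrium lies between 1/2 and 1, approaching 1 as $S \gg M$.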


We evaluate on the evolved weights because we want to measure how one
neuron depends on another. Clearly, another possible mutual information
measure is to allow the whole network to relax under the constraint
(including weights). However, this global measure assesses the capacity of the
whole neuronal complex. By evaluating only the local mutual information
we measure directly the information content contained in the synapse.
Note that $Y_{i|j}$ can be written as

$$\begin{aligned} Y_{i|j} &= \sum_{k \neq i,j} \phi_{ki} (2p_k - 1) + \phi_{ji} (2X_j - 1) \\ &= \sum_{k \neq i} \phi_{ki} (2p_k - 1) + \phi_{ji} (2X_j - 1) - \phi_{ji} (2p_j - 1) \\ &= \sum_{k \neq i} \phi_{ki} (2p_k - 1) + 2 \phi_{ji}\, \delta p(x_j) \end{aligned} \tag{34}$$

with $\delta p(x_k) = X_k - p_k$. Consequently, we can express the conditional switching
probability as

$$M_{i|j} = \left( 1 + \exp[Y_i] \exp[2 \phi_{ji}\, \delta p(x_j)] \right)^{-1} . \tag{35}$$

We assume that both $\phi_{ji}$ and $\delta p(x_k)$ are small and expand $M_{i|j}$ in series of
$\delta p$, to get

$$M_{i|j} = M_i - 2 M_i (1 - M_i) \phi_{ji}\, \delta p_j + 2 M_i (1 - M_i)(1 - 2M_i) \phi_{ji}^2\, \delta p_j^2 . \tag{36}$$

Since this last expression does not depend explicitly on $p_{i|j}$, the equilibrium
value of $p_{i|j}$ is algebraically the same as that of $p_i$ but with $M_i \to M_{i|j}$ (Eqns.
32 and 33).
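
The quality of the second-order expansion (Eq. 36) relative to the exact expression (Eq. 35) can be gauged with a few lines of code. The function names and the numerical values below are hypothetical and serve only to illustrate the size of the truncation error.

```python
import math


def switching(Y):
    """Switching probability M = 1 / (1 + exp(Y)) for activity Y."""
    return 1.0 / (1.0 + math.exp(Y))


def conditional_switching_exact(Y_i, phi_ji, dp_j):
    """Eq. 35: exact conditional switching probability."""
    return switching(Y_i + 2.0 * phi_ji * dp_j)


def conditional_switching_series(Y_i, phi_ji, dp_j):
    """Eq. 36: expansion to second order in dp_j."""
    M = switching(Y_i)
    first = -2.0 * M * (1.0 - M) * phi_ji * dp_j
    second = 2.0 * M * (1.0 - M) * (1.0 - 2.0 * M) * phi_ji ** 2 * dp_j ** 2
    return M + first + second


# Hypothetical activity, weight and deviation.
Y_i, phi_ji, dp_j = 1.0, 0.05, 0.3
exact = conditional_switching_exact(Y_i, phi_ji, dp_j)
series = conditional_switching_series(Y_i, phi_ji, dp_j)
print(exact, series, abs(exact - series))
```

For small $\phi_{ji}\, \delta p_j$ the discrepancy is of third order in that product, so the two values agree to several decimal places.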
Approximating mutual information. The sum in Eq. 26 comprises four
terms that we can write explicitly:

$$\begin{aligned} H[I,J] ={}& (1 - p_{i|0})(1 - p_j) \log \left[ \frac{1 - p_{i|0}}{1 - p_i} \right] + p_{i|0} (1 - p_j) \log \left[ \frac{p_{i|0}}{p_i} \right] \\ &+ (1 - p_{i|1})\, p_j \log \left[ \frac{1 - p_{i|1}}{1 - p_i} \right] + p_{i|1}\, p_j \log \left[ \frac{p_{i|1}}{p_i} \right] . \end{aligned} \tag{37}$$


Substituting Eqns. 32, 33 and 36 into the last formula gives the exact
expression for $H[I,J]$, which is lengthy and complicated. Using automated
algebra software (Mathematica 10.0) we perform a series expansion to
second order. Using the fact that $S \gg M$ we get that, in both selective
scenarios:

$$H[I,J] = \frac{2 M_i M_j \phi_{ij}^2}{S^2} + O(M^2) . \tag{38}$$
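
A numerical comparison between the four-term sum (Eq. 37) and the approximation (Eq. 38) can be sketched for the directional-selection case only, using Eq. 32 for all equilibrium frequencies. The sketch assumes natural logarithms, the exact conditional switching probability of Eq. 35 rather than its expansion, and hypothetical parameter values; the function names are likewise illustrative.

```python
import math


def switching(Y):
    """Switching probability M = 1 / (1 + exp(Y))."""
    return 1.0 / (1.0 + math.exp(Y))


def equilibrium(M, S):
    """Eq. 32 (directional selection)."""
    return (S - 2.0 * M + math.sqrt(4.0 * M * M + S * S)) / (2.0 * S)


def mutual_information(Y_i, Y_j, phi, S):
    """Eq. 37 with conditional frequencies obtained from Eqs. 32 and 35."""
    p_i = equilibrium(switching(Y_i), S)
    p_j = equilibrium(switching(Y_j), S)
    H = 0.0
    for s, weight in ((0, 1.0 - p_j), (1, p_j)):
        p_cond = equilibrium(switching(Y_i + 2.0 * phi * (s - p_j)), S)
        H += weight * ((1.0 - p_cond) * math.log((1.0 - p_cond) / (1.0 - p_i))
                       + p_cond * math.log(p_cond / p_i))
    return H


def mutual_information_approx(Y_i, Y_j, phi, S):
    """Leading term of Eq. 38."""
    return 2.0 * switching(Y_i) * switching(Y_j) * phi ** 2 / S ** 2


# Hypothetical values with S >> M and a small weight.
Y_i, Y_j, phi, S = 5.0, 5.0, 0.05, 10.0
full = mutual_information(Y_i, Y_j, phi, S)
approx = mutual_information_approx(Y_i, Y_j, phi, S)
print(full, approx, full / approx)
```

In this regime the ratio between the two expressions stays close to one, and the mutual information vanishes exactly when the weight is zero.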


C Dependency of the dynamics on the initial 
conditions of neuronal spiking 

In this appendix we present results that show that once a system learns the 
landscape, the retrieval of the information is much quicker than learning, and 
it is independent of the initial conditions of the spiking probabilities. 

The way we assess this quick retrieval effect is by first solving the full
dynamics. Once in equilibrium, we fix the Hebbian weights and compute the
corresponding switching probabilities $M_i$ at each neuronal locus.

After the learning phase has been completed, we run the dynamics for
the neuronal probabilities $p_i$ with these fixed $M_i$'s but choosing random
initial spiking frequencies (uniformly in the interval (0,1)). We measure the
state of the system relative to the fixed point of the learning neurodynamics,
$p^*$, and calculate the Euclidean distance from this state:

$$D_{fp} = \sqrt{\sum_i \left[ p_i^* - p_i(\tau) \right]^2} .$$

Because by time $1/S$ the $p_i$ have increased in representation, by time $\tau = S$
($1/S < S \ll 1/\lambda$) they are very close to the fixed point, and we expect $D_{fp}$
to be very close to zero.

Figure 10A presents the results from simulations where we randomise
initial conditions after learning. The figure shows how the average and the
variance of the distance (over multiple realisations) to the fixed point quickly
decay to zero, and by time $\tau = S$ they are already negligible. Figure 10B
shows the mean and variance of a set of ensembles under different numbers of
neuronal loci n and different fitness landscapes.
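
The retrieval experiment can be mimicked with a small script. The sketch below is a simplified stand-in for the simulations: it assumes directional selection with per-locus gradients $S_i$ (the values listed in the caption of Figure 10), uses hypothetical fixed switching probabilities instead of learnt ones, and integrates the neurodynamics with a forward-Euler scheme. The function names and the step size are illustrative choices.

```python
import math
import random


def fixed_point(M, S):
    """Equilibrium frequency under directional selection (Eq. 32)."""
    return (S - 2.0 * M + math.sqrt(4.0 * M * M + S * S)) / (2.0 * S)


def distance_to_fixed_point(M, S, tau, dt=1e-3, rng=random):
    """Euclidean distance D_fp at time tau, starting from random frequencies.

    Integrates dp_i/dt = p_i (1 - p_i) S_i + M_i (1 - 2 p_i) with fixed M_i.
    """
    p = [rng.random() for _ in M]
    for _ in range(int(round(tau / dt))):
        p = [pi + dt * (pi * (1.0 - pi) * Si + Mi * (1.0 - 2.0 * pi))
             for pi, Mi, Si in zip(p, M, S)]
    p_star = [fixed_point(Mi, Si) for Mi, Si in zip(M, S)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_star, p)))


rng = random.Random(1)
S = [3.1, 3.2, 4.6, 4.8, 4.8, 4.9, 6.2, 7.8, 8.1, 9.9]  # gradients from Fig. 10
M = [0.05] * len(S)  # hypothetical switching probabilities (not learnt values)
d_fp = distance_to_fixed_point(M, S, tau=10.0, rng=rng)
print(d_fp)
```

With these assumptions the distance at $\tau = 10$ is negligible regardless of the random starting point, which is the qualitative behaviour described above.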


D Scaling of selection intensity and associative weights


We show here the scaling of Fig. 3 in the main text. 

First, consider that of all neuronal loci there is one that has the maximum 
switching probability, which we denote with the subscript m. We denote the 
relative switching probability as $\tilde{M}_i$, which by the definition of M is

$$\tilde{M}_i = \frac{M_i}{M_m} = \frac{1 + e^{E_m}}{1 + e^{E_i}} . \tag{39}$$
Figure 10: Effect of initial conditions on the dynamics of learning.

(A) Euclidean distance to the fixed point $D_{fp}$ of the learning dynamics
(red) and the average over an ensemble (100 samples) of dynamics with learnt
weights (black) with random initial conditions for the spiking probabilities.
The shaded region in grey covers the standard deviation. In this example
n = 10; the landscape is asymmetric with gradients $S_i$ given by
{3.1, 3.2, 4.6, 4.8, 4.8, 4.9, 6.2, 7.8, 8.1, 9.9}. (B) Mean and variance of the distance
to the fixed point $D_{fp}$ of an ensemble of simulations under different
numbers of neuronal loci n (see inset legend) at time $\tau = S = 10$. Each
point is an average over 30 runs with learnt weights and randomised initial
conditions for the spiking probabilities with distribution $U(0,1)$. For each
of these ensembles the landscape is a random vector with each component
distributed as $U(0, S]$. Other parameters as in Fig. 3 in the main text.







As a first approximation we note that for large activity values

$$\tilde{M}_i \simeq \exp(E_m - E_i) . \tag{40}$$

The activity difference $\Delta_i = E_m - E_i$ is:

$$\Delta_i = E_m - E_i = \sum_{j \neq m} \phi_{mj} (2p_j - 1) - \sum_{k \neq i} \phi_{ik} (2p_k - 1) . \tag{41}$$

The two sums overlap at all but two indices: j = i and k = m, respectively.
Hence

$$\Delta_i = \phi_{mi} (2p_i - 1) - \phi_{im} (2p_m - 1) + \sum_{j \neq m,i} (\phi_{mj} - \phi_{ij})(2p_j - 1) . \tag{42}$$

As a second approximation we assume that weights are of the order $\phi \sim (n - 1)^{-1/2}$.
Consequently the sum is negligible and

$$\Delta_i \simeq \frac{2 (p_i - p_m)}{\sqrt{n - 1}} . \tag{43}$$

The equilibrium condition of the neurodynamical equation gives

$$p = \frac{1}{2} \left( \sqrt{4\mu^2 + 1} - 2\mu + 1 \right) \tag{44}$$

where $\mu = M/S$. Assuming that $S \gg M$ we have that, to first order in $\mu$,
$p \simeq 1 - \mu$. Thus,

$$\Delta_i \simeq \frac{2 (\mu_m - \mu_i)}{\sqrt{n - 1}} . \tag{45}$$

A third approximation follows: since $\mu_m$ is the term with the largest S,
then, compared to most $\mu_i$, it is negligible. Therefore $\Delta_i \simeq -2\mu_i / \sqrt{n - 1}$.
Summarising, the relative switching probability is:

$$\tilde{M}_i \simeq \exp \left[ -\frac{2\mu_i}{\sqrt{n - 1}} \right] . \tag{46}$$

Now, simply call $\tilde{S} = \sqrt{n - 1}\, S/M$, so we get the scaled relationship

$$\tilde{M}_i \simeq \exp \left[ -\frac{2}{\tilde{S}} \right] . \tag{47}$$
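
Two of the steps in this appendix lend themselves to a quick numerical check: the first-order expansion $p \simeq 1 - \mu$ of Eq. 44, and the equivalence of Eqs. 46 and 47 under the rescaling $\tilde{S} = \sqrt{n - 1}\, S/M$. The following sketch uses hypothetical values of $\mu$ and n, and the function names are introduced only for this example.

```python
import math


def equilibrium_p(mu):
    """Eq. 44: equilibrium frequency as a function of mu = M / S."""
    return 0.5 * (math.sqrt(4.0 * mu * mu + 1.0) - 2.0 * mu + 1.0)


def relative_switching(mu_i, n):
    """Eq. 46: relative switching probability in terms of mu_i and n."""
    return math.exp(-2.0 * mu_i / math.sqrt(n - 1))


def relative_switching_scaled(S_tilde):
    """Eq. 47: the same quantity in terms of the scaled intensity S_tilde."""
    return math.exp(-2.0 / S_tilde)


mu, n = 0.01, 10                       # hypothetical values with S >> M
print(equilibrium_p(mu), 1.0 - mu)     # differ at order mu**2
S_tilde = math.sqrt(n - 1) / mu        # S_tilde = sqrt(n - 1) * S / M
print(relative_switching(mu, n), relative_switching_scaled(S_tilde))
```

The first pair of numbers differs only at order $\mu^2$, and the second pair coincides, so the scaled form collapses the dependence on n, S and M into the single variable $\tilde{S}$.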


E Effect of the topology of the network 

The number of neurons in the brain is vast; yet it is very sparsely connected, 
with a modular nature. Empirical measurements suggest small world topologies.
However, what lies behind this distribution, and how truly ‘random’
it is, remains unclear. From the point of view of our framework, we can 
evaluate what effect different topologies have on the neurodynamics. 

We can interpret the random topologies in two ways. First, under the 
Changeux selective (but not evolvable) scenario where neuronal circuits are 
fixed, we can think that from the point of view of an arbitrary cognitive task, 
any pre-established circuitry is essentially random. Second, under the ‘neuronal
replicator hypothesis’, we can think that once a neuronal group is
allocated to a new task, its circuitry is in a state optimised for a
previous task. However, from the point of view of the new task, the current
configuration is random and potentially uninformative. 

In this supporting text, we study how random topologies affect the neurodynamics.
We consider learning networks that have topologies drawn from
random network distributions, but which remain constant during the learning
process. For any given random topology we run the selection-learning system
on a stabilising landscape with a randomly placed optimum value (which also
remains constant during a run). We consider systems with between 5 and 30
neuronal loci.

We first evaluate the effect of Erdős-Rényi (ER) topologies. These are
random networks built simply by placing random connections between pairs of
nodes with a probability p. Although we do not have any reason to believe


Figure 11: Distribution of switching probabilities of neuronal circuits 
with Erdos-Renyi topologies. 

(A) Empirical probability function on the outcome of the selection-learning 
process. (B) Relationship between switching probability and the degree of 
each neuronal locus. In each simulation we randomise the initial conditions 
(as in the figures in the main text), the position of the optimum (uniformly 
in the real interval [1, n − 1]) and the learning network (using a random p).
For each n we perform 1000 simulations. $\lambda = 0.01$.

that this ER network distribution is realised in neuronal circuits, it is the 
most basic pattern of random networks, providing a baseline of uncertainty. 
Figure 11A shows the distribution of the equilibrium switching probabilities
M under ER distributions for different numbers of neuronal loci. The pattern
is clear: most neurons spike in a random, unspecified way. This reflects
the fact that the systems are unable to find solutions. Figure 11B shows the
dependency of the switching probabilities on the degree of each neuron
(number of synaptic connections with other neurons). Only as the degree
increases can some neurons become more specific. However, generally speaking
the spread of the values is high.
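
For reference, an ER topology and its node degrees can be generated in a few lines. This is a generic sketch of the standard construction, not the code used for the figures; the function names and the adjacency-set representation are choices made for the example.

```python
import random


def erdos_renyi(n, p, rng=random):
    """Undirected ER graph: each pair of nodes is connected with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degrees(adj):
    """Number of connections of each node, ordered by node index."""
    return [len(adj[i]) for i in sorted(adj)]


rng = random.Random(0)
connection_probability = rng.random()   # a random p, as in the simulations
circuit = erdos_renyi(10, connection_probability, rng)
print(connection_probability, degrees(circuit))
```

Nodes with degree zero appear frequently for small p, which is one reason why such circuits contain many neurons that cannot fire specifically.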

We can compare these patterns with that of fully connected neuronal
circuits (Fig. 12). In this case the switching probabilities are bimodal with
peaks close to M = 0 and M = 1, indicating highly specific firing, as argued
in the main text.

We also perform a similar analysis with Watts-Strogatz (WS) ‘small
world’ topologies. These kinds of networks have become popularised through





































Figure 12: Distribution of switching probabilities for fully connected 
neuronal circuits. 

As in Figure 11, but using fully connected networks; n as in the inset of
Fig. 11A.

the “six degrees of separation” motto. The key property of small world net¬ 
works is that any node can be reached from any other by connecting through 
a small number of intermediary nodes. Unlike the ER networks which often 
have disconnected nodes, the WS networks are very well connected. Some 
works have reported that neuronal networks have this kind of properties []. 

Figure 13A shows that under these random network models, there is an
almost uniform distribution of switching probabilities M, although there are
almost no circuits with neurons that fire specifically. This occurs because
although the circuits as a whole are well connected, each neuron has only
very few connections (Fig. 13B), which in turn results in low activities at
most neurons.
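
A WS topology can be sketched with the standard ring-lattice-plus-rewiring construction. The text does not specify the neighbourhood size or the implementation used, so the parameter `k`, the function name and the values below are assumptions for illustration.

```python
import random


def watts_strogatz(n, k, beta, rng=random):
    """Undirected WS graph: ring lattice with k neighbours per node (k even),
    where each lattice edge is rewired with probability beta."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in adj[i] and rng.random() < beta:
                # New target: any node that is neither i nor already a neighbour.
                candidates = [c for c in range(n) if c != i and c not in adj[i]]
                if candidates:
                    new_j = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new_j)
                    adj[new_j].add(i)
    return adj


rng = random.Random(0)
rewiring_probability = rng.random()      # a random rewiring probability
circuit = watts_strogatz(10, 4, rewiring_probability, rng)
print(sorted(len(neigh) for neigh in circuit.values()))
```

Rewiring moves edges without adding any, so the mean degree stays at `k` however well connected the circuit is as a whole; this is the low per-neuron connectivity referred to above.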

Figures 14 and 15 show the neurodynamics assuming random networks
for the circuit topology. These examples show that in both cases the systems
are unable to learn due to the constraints imposed by the network topology.
We see that most neurons reach a state where they do not fire specifically.
We see this learning process that is unable to learn on an existing circuitry
as a state of impasse as defined in cognitive psychology: any alternative that
can be reached under the current configuration or representation leads to an
even worse ‘solution’. Algorithmically, this corresponds to a local optimum
Figure 13: Distribution of switching probabilities of neuronal circuits
with Watts-Strogatz small world topologies.

As in Figure 11, but with Watts-Strogatz deviates with a random rewiring
probability. n as in the inset of Fig. 11A.

on the fitness landscape.


F Dynamics of structural plasticity under synaptic costs


Figure 16 shows the evolution of fitness for neuronal systems targeting different
values T. Although the mean fitness decreases with T, this has no
consequences on the outcome of the system. This is because the success of
a neuronal group only depends on its relative advantage over other neuronal
groups with different configurations. The relevant pattern is the decrease of
performance as the synaptic costs increase.


G Distribution of the evolved networks 

As reported in the main text, the neuronal circuits become less connected 
as the cost per synapse k increases. Figure 17 shows that the distribution of
degrees (number of synapses per neuron) becomes more skewed as k becomes 
larger. 
Figure 14: Neurodynamics under random (Erdős-Rényi) circuit
topologies.

(A) Neurodynamics (inset: fitness). (B) Switching probability (inset: associative
weights). (C) Topology of the neuronal circuit. Parameters as in Figs.
2,4 of the main text.




Figure 15: Neurodynamics under random (Watts-Strogatz) circuit 
topologies. 

(A) Neurodynamics (inset: fitness). (B) Switching probability (inset: associative
weights). (C) Topology of the neuronal circuit. Parameters as in Figs.
2,4 of the main text. 
Figure 16: Neuronal systems under costly synapses

(A-C) Dynamics of the fitness (relative to the maximum) of neuronal systems
under different cost per connection (see legend in A) and different target values
(A: 4; B: 5; C: 10). Each curve is a replica of 16 independent simulations.
(D) Equilibrium fitness as a function of the synaptic costs. Each point is an
average at the stationary values (last 50 time points) and over 16 simulations
(except for T = 7, which is the data in Fig. 7 in the main text). Filled
diamonds T = 4; open diamonds T = 5; filled bullets T = 7; open bullets
T = 10. Parameters as in Fig. 7 in the main text.
Figure 17: Degree distribution of the evolved neuronal circuits.

The data corresponds to the simulations of Fig. in the main text. The 
histograms are constructed by taking the last 50 points of 112 simulations 
and pooling the degrees of all networks. Since each circuit consists of n = 10 
neuronal loci, there are in total 55000 data points in each histogram. 


Note that there are always zero classes because the target value T determines
that the optimal circuit consists of T neurons that are fully connected
(particularly with high in-degree) and n − T that are either disconnected or
have zero in-degree. Consequently, this sets a lower bound for the
number of unconnected neurons. The distribution of degrees is dependent
on both the synaptic cost and the target values (Fig. 18).

We tried fitting several distributions: Erdős-Rényi, Watts-Strogatz, Poisson,
Binomial, and scale-free networks, and none fits the empirical distributions
well.
Figure 18: Mean degree of the evolved neuronal circuits for different
targets and synaptic costs.

Each point is an average at the stationary values (last 50 time points) and
over 16 simulations (except for T = 7, which is the data in Fig. 7 in the main
text). Filled diamonds T = 4; open diamonds T = 5; filled bullets T = 7;
open bullets T = 10. Parameters as in Fig. 7 in the main text.
H Distribution of synaptic lifetimes


In this Appendix we present further results on the distribution of synaptic
lifetimes. We ran extensive simulations with an alternative target value
to that in the main text, T = 4. (Due to prohibitively long computation times
we chose to sacrifice analysing more target values in order to focus on one
value, but with sufficient statistical power.)


Figure 19 shows results for pooled samples of synaptic lifetimes of 14
simulations each lasting 1000 steps. We partition the synaptic lifetimes into
‘early (learning) period’, and ‘late (post-learning) period’. In the learning
period, there seems to be no relationship between the average lifetime and
the standard deviation of the synaptic lifetimes, whereas for the post-learning
stationary period there is a clear trend (Fig. 19A). This trend seems to be
independent of the target values. The average synaptic lifetimes seem to be
largely independent of the synaptic costs (Fig. 19B).

Despite the fact that there is a clear difference between the synapses
established early and those established late, Mann-Whitney tests on each
of the different simulations give p-values that are highly variable, and very
often fail to reject the null hypothesis that the distributions of early and late
synapses are the same (Fig. 20). However, pooling the replicate simulations
leads, in all cases, to rejection of the null hypothesis (Table 2).
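
The comparison between early and late synaptic lifetimes can be illustrated with a bare-bones Mann-Whitney test. The text does not state which implementation was used; the sketch below computes the U statistic directly and uses the normal approximation for a two-sided p-value, without tie or continuity corrections, and the lifetime samples are made-up numbers for demonstration.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, p) for samples x and y.

    U counts the pairs with x > y (ties count one half). The two-sided
    p-value uses the large-sample normal approximation, with no tie or
    continuity correction, so it is only indicative for small samples.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Hypothetical synaptic lifetimes (in iterations).
early = [3, 5, 8, 12, 4, 7, 9, 6]
late = [15, 22, 9, 30, 18, 25, 11, 40]
print(mann_whitney_u(early, late))
```

With only a handful of lifetimes per run the p-value is sensitive to individual observations, which is consistent with the variability across single simulations reported above; pooling increases the sample sizes and hence the power of the test.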
Figure 19: Synaptic lifetimes for different costs and targets

(A) Relationship between the average and standard deviation of the synaptic
lifetimes. (B) In both panels: blue T = 4, red T = 7; rectangles: synapses
established during early, learning periods (before 200 iterations); bullets:
synapses established in post-learning periods (at or after 200 iterations).
The data is pooled from 14 independent runs, each of 1000 iterations.
Other parameters as in Fig. 9 in the main text.


Table 2: p-values of pooled simulations

Cost \ Target    T = 4          T = 7
k = 0.010        1 × 10^-2      4 × 10^-4
k = 0.018        4 × 10^-5      2 × 10^-2
k = 0.034        4 × 10^-4      3 × 10^-5
k = 0.063        3 × 10^-5      8 × 10^-5
k = 0.117        3 × 10^-9      5 × 10^-8
k = 0.215        2 × 10^-11     5 × 10^-6
k = 0.398        2 × 10^-15     1 × 10^-8
k = 0.736        5 × 10^-12     1 × 10^-6


Test as in Fig. 20 but with pooled samples. 
Figure 20: Significance of the difference of synaptic lifetimes during
and after learning

The plots show the p-values for Mann-Whitney tests (note the log-scale);
all with significance level 0.05. Null hypothesis $H_0$: the means of the distributions
of synaptic lifetimes during learning and post-learning periods are equal.
Alternative hypothesis $H_a$: the means of the distributions are different. Marks
at p = 0.01 actually indicate that p < 0.01. The data corresponds to the
simulations of the previous figure (and also Fig. 9 of the main text).